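I have been learning Flask for a while now and have always wanted to understand its source code. I recently read a very thorough analysis on an expert's blog and, to my surprise, actually understood it. So here is my own walkthrough.
1. Before Flask, a word about WSGI:
Once you understand the HTTP protocol and HTML documents, the essence of a web application becomes clear:
The browser sends an HTTP request;
The server receives the request and generates an HTML document;
The server sends the HTML document to the browser as the body of the HTTP response;
The browser receives the HTTP response, extracts the HTML document from the HTTP body, and displays it.
So the simplest web application saves the HTML in files ahead of time and uses off-the-shelf HTTP server software to accept user requests, read the HTML from the files, and return it. Apache, Nginx, Lighttpd and other common static servers do exactly this.
To generate HTML dynamically, we would have to implement those steps ourselves. But accepting HTTP requests, parsing them, and sending HTTP responses is grunt work; if we wrote that low-level code ourselves, we would spend a month or so reading the HTTP specification before writing a single line of dynamic HTML.
The right approach is to leave the low-level code to dedicated server software and use Python to focus on generating the HTML document. Because we don't want to deal with TCP connections or the raw HTTP request and response formats, we need a uniform interface that lets us concentrate on writing web business logic in Python.
That interface is WSGI: the Web Server Gateway Interface.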
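2. What WSGI does
WSGI acts as an interface: on one side it connects to the server, on the other it connects to the application's actual functionality.
The WSGI interface definition is very simple: it only requires the web developer to implement one function in order to respond to HTTP requests. Here is the simplest web version of "Hello, web!":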
def application(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/html')])
    # Under Python 3 the body must be an iterable of bytes, not a str
    return [b'<h1>Hello, web!</h1>']
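The application() function above is an HTTP handler that conforms to the WSGI standard. It takes two parameters:
environ: a dict object containing all the HTTP request information;
start_response: a function that starts the HTTP response by sending the status and headers.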
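To make the two parameters concrete, here is a minimal sketch that plays the server's role by hand using only the standard library. The helper name call_wsgi is made up for this illustration; a real server does much more (sockets, parsing, error handling).

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Same shape as the handler above; the body is bytes."""
    start_response('200 OK', [('Content-Type', 'text/html')])
    return [b'<h1>Hello, web!</h1>']


def call_wsgi(app, path='/'):
    """Hypothetical helper: build environ, call the app, collect the result."""
    environ = {'PATH_INFO': path}
    setup_testing_defaults(environ)  # fills in the other required WSGI keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        # A real server would write the status line and headers to the socket
        captured['status'] = status
        captured['headers'] = headers

    body = b''.join(app(environ, start_response))
    return captured['status'], captured['headers'], body


status, headers, body = call_wsgi(application)
print(status, body)
```

The application never touches a socket: everything it knows about the request comes from environ, and everything it says about the response goes through start_response and the returned body.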
3. Flask and WSGI
from flask import Flask

app = Flask(__name__)

@app.route('/')
def index():
    return 'Hello World!'
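And just like that, a Flask instance is created.
When the server calls app, what actually runs is Flask's __call__ method; this is where the app's work begins.
The source of Flask.__call__ is as follows: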
class Flask(_PackageBoundObject):
    # ... omitted ...
    def __call__(self, environ, start_response):  # __call__ method of the Flask instance
        """Shortcut for :attr:`wsgi_app`."""
        return self.wsgi_app(environ, start_response)
So when a request comes in, __call__ is invoked, but as you can see it really just calls the wsgi_app method, passing along environ and start_response.
Let's see how wsgi_app is defined:
def wsgi_app(self, environ, start_response):
    """The actual WSGI application. This is not implemented in
    `__call__` so that middlewares can be applied without losing a
    reference to the class. So instead of doing this::

        app = MyMiddleware(app)

    It's a better idea to do this instead::

        app.wsgi_app = MyMiddleware(app.wsgi_app)

    Then you still have the original application object around and
    can continue to call methods on it.

    .. versionchanged:: 0.7
       The behavior of the before and after request callbacks was changed
       under error conditions and a new callback was added that will
       always execute at the end of the request, independent on if an
       error occurred or not. See :ref:`callbacks-and-errors`.

    :param environ: a WSGI environment
    :param start_response: a callable accepting a status code,
                           a list of headers and an optional
                           exception context to start the response
    """
    ctx = self.request_context(environ)
    ctx.push()
    error = None
    try:
        try:
            response = self.full_dispatch_request()
        except Exception as e:
            error = e
            response = self.handle_exception(e)
        return response(environ, start_response)
    finally:
        if self.should_ignore_error(error):
            error = None
        ctx.auto_pop(error)
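The docstring's advice about wrapping app.wsgi_app rather than app itself can be tried out without Flask. The sketch below is only an illustration: TinyApp and HeaderMiddleware are made-up names, and TinyApp imitates just the __call__ to wsgi_app delegation shown above.

```python
class TinyApp:
    """Hypothetical stand-in for Flask: __call__ just forwards to wsgi_app."""

    def __init__(self):
        self.name = 'tiny'

    def wsgi_app(self, environ, start_response):
        start_response('200 OK', [('Content-Type', 'text/plain')])
        return [b'hello']

    def __call__(self, environ, start_response):
        return self.wsgi_app(environ, start_response)


class HeaderMiddleware:
    """Hypothetical middleware that appends one header to every response."""

    def __init__(self, wrapped):
        self.wrapped = wrapped

    def __call__(self, environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            headers = list(headers) + [('X-Demo', '1')]
            return start_response(status, headers)

        return self.wrapped(environ, patched_start_response)


app = TinyApp()
# Wrap only the WSGI entry point; app itself stays a TinyApp instance
app.wsgi_app = HeaderMiddleware(app.wsgi_app)

seen = {}


def start_response(status, headers, exc_info=None):
    seen['headers'] = headers


body = app({}, start_response)
print(type(app).__name__, seen['headers'], body)
```

Had we written app = HeaderMiddleware(app) instead, app.name and every other method of the original object would no longer be reachable through app, which is the problem the docstring warns about.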
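Step 1: create the request object and the request context:
As you can see, ctx = self.request_context(environ) brings in the concepts of the request context and the application context. They are kept in a stack structure and behave like a stack.
Put simply, this creates a request object together with a request_context that holds the request information.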
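As a rough picture of the push and pop around a request, here is a minimal sketch of a context stack. ContextStack and RequestContext are simplified, made-up classes: Flask's real stack is thread-local, and its request object is far richer than the dict used here.

```python
class ContextStack:
    """Hypothetical, single-threaded stand-in for the request context stack."""

    def __init__(self):
        self._items = []

    def push(self, ctx):
        self._items.append(ctx)

    def pop(self):
        return self._items.pop()

    @property
    def top(self):
        # The context of the request currently being handled, if any
        return self._items[-1] if self._items else None


class RequestContext:
    """Hypothetical context: bundles the request data built from environ."""

    def __init__(self, stack, environ):
        self.stack = stack
        self.request = {'path': environ.get('PATH_INFO', '/')}

    def push(self):
        self.stack.push(self)

    def pop(self):
        popped = self.stack.pop()
        assert popped is self, 'contexts must be popped in reverse push order'


stack = ContextStack()
ctx = RequestContext(stack, {'PATH_INFO': '/hello'})
ctx.push()
current_path = stack.top.request['path']  # what a view would see as "the request"
ctx.pop()
print(current_path, stack.top)
```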
Step 2: request preprocessing, error handling, and going from request to response:
response = self.full_dispatch_request()
The response is assigned the return value of full_dispatch_request(), so let's look at that method:
def full_dispatch_request(self):
    """Dispatches the request and on top of that performs request
    pre and postprocessing as well as HTTP exception catching and
    error handling.

    .. versionadded:: 0.7
    """
    self.try_trigger_before_first_request_functions()  # run the before-first-request hooks once, similar to middleware
    try:
        request_started.send(self)  # emit the request_started signal
        rv = self.preprocess_request()  # run request preprocessing
        if rv is None:
            rv = self.dispatch_request()
    except Exception as e:
        rv = self.handle_user_exception(e)
    return self.finalize_request(rv)
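First look at try_trigger_before_first_request_functions(). It runs the registered before_first_request_funcs exactly once and then sets _got_first_request to True; once the flag is True, the method returns immediately and request handling proceeds.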
def try_trigger_before_first_request_functions(self):
    """Called before each request and will ensure that it triggers
    the :attr:`before_first_request_funcs` and only exactly once per
    application instance (which means process usually).

    :internal:
    """
    if self._got_first_request:
        return
    with self._before_request_lock:
        if self._got_first_request:
            return
        for func in self.before_first_request_funcs:
            func()
        self._got_first_request = True
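The check, lock, check-again shape of this method is the double-checked locking pattern. Below is a minimal standalone sketch of it; the RunOnce class and its attribute names are invented for illustration and are not Flask's.

```python
import threading


class RunOnce:
    """Hypothetical helper: run registered functions once, even with many threads."""

    def __init__(self):
        self._done = False
        self._lock = threading.Lock()
        self.funcs = []

    def trigger(self):
        if self._done:            # fast path: no lock needed after the first run
            return
        with self._lock:
            if self._done:        # another thread may have finished while we waited
                return
            for func in self.funcs:
                func()
            self._done = True


calls = []
once = RunOnce()
once.funcs.append(lambda: calls.append('init'))

threads = [threading.Thread(target=once.trigger) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(calls)
```

The first check keeps the common case cheap, and the second check inside the lock is what guarantees the functions cannot run twice when several first requests arrive at the same time.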
got_first_request is defined as a read-only property (note the @property decorator). Its docstring says the attribute is set to True if the application has started handling the first request.
class Flask(_PackageBoundObject):
    # ... omitted ...
    @property
    def got_first_request(self):
        """This attribute is set to ``True`` if the application started
        handling the first request.

        .. versionadded:: 0.8
        """
        return self._got_first_request
Back in full_dispatch_request(), look at preprocess_request(). This is where Flask's hooks run; they work much like middleware and are what implements the before_request feature.
try:
    request_started.send(self)
    rv = self.preprocess_request()
    if rv is None:
        rv = self.dispatch_request()
except Exception as e:
    rv = self.handle_user_exception(e)
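The "if rv is None" check is what lets a before_request hook cut a request short. The following sketch imitates that control flow under simplifying assumptions: the request is a plain dict passed in explicitly (Flask hooks take no arguments and read the request from the context), and every name here is a stand-in rather than Flask's own implementation.

```python
before_request_funcs = []


def before_request(func):
    """Register a hook, in the spirit of Flask's before_request decorator."""
    before_request_funcs.append(func)
    return func


def preprocess_request(request):
    """Run hooks in order; the first non-None return value wins."""
    for func in before_request_funcs:
        rv = func(request)
        if rv is not None:
            return rv
    return None


def dispatch_request(request):
    # Stand-in for calling the matched view function
    return 'view for ' + request['path']


def handle(request):
    rv = preprocess_request(request)
    if rv is None:               # no hook answered, so the view runs
        rv = dispatch_request(request)
    return rv


@before_request
def require_token(request):
    if request.get('token') != 'secret':
        return 'forbidden'       # short-circuits: the view is never called


print(handle({'path': '/', 'token': 'secret'}))
print(handle({'path': '/'}))
```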
dispatch_request() is what dispatches the request. Once a request arrives for some URL, how does the app know how to respond? Through this method.
Step 3: request dispatch with dispatch_request:
def dispatch_request(self):
    """Does the request dispatching. Matches the URL and returns the
    return value of the view or error handler. This does not have to
    be a response object. In order to convert the return value to a
    proper response object, call :func:`make_response`.

    .. versionchanged:: 0.7
       This no longer does the exception handling, this code was
       moved to the new :meth:`full_dispatch_request`.
    """
    req = _request_ctx_stack.top.request  # take the request from the top of the context stack
    if req.routing_exception is not None:
        self.raise_routing_exception(req)
    rule = req.url_rule
    # if we provide automatic options for this URL and the
    # request came with the OPTIONS method, reply automatically
    if getattr(rule, 'provide_automatic_options', False) \
       and req.method == 'OPTIONS':
        return self.make_default_options_response()
    # otherwise dispatch to the handler for that endpoint
    return self.view_functions[rule.endpoint](**req.view_args)
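The main job of this step is to connect the '/' in @app.route('/') with the index function: the matched URL rule carries an endpoint name, and view_functions maps that endpoint to the view. The details of how the URL matching itself works are fairly involved; at least I haven't fully figured them out.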
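The two lookups on the last line of dispatch_request (URL to endpoint, endpoint to view function) can be sketched with two dictionaries. TinyRouter is a made-up class that only does exact string matching; Flask's actual URL matching supports converters, HTTP methods and more, none of which is modelled here.

```python
class TinyRouter:
    """Hypothetical router: exact-match paths only."""

    def __init__(self):
        self.url_map = {}          # path -> endpoint name
        self.view_functions = {}   # endpoint name -> view function

    def route(self, path):
        def decorator(func):
            endpoint = func.__name__   # endpoint defaults to the function name
            self.url_map[path] = endpoint
            self.view_functions[endpoint] = func
            return func
        return decorator

    def dispatch_request(self, path, **view_args):
        endpoint = self.url_map[path]
        # Same shape as: self.view_functions[rule.endpoint](**req.view_args)
        return self.view_functions[endpoint](**view_args)


router = TinyRouter()


@router.route('/')
def index():
    return 'Hello World!'


print(router.url_map)
print(router.dispatch_request('/'))
```

The extra level of indirection through the endpoint name is what allows several URLs to share one view and lets URLs be built from an endpoint name instead of being hard-coded.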
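Next, full_dispatch_request() passes rv to finalize_request(), which uses make_response() to turn rv into a response; that response is what ends up assigned to response in wsgi_app.
So how does make_response() do this? Look at the source: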
def make_response(self, rv):
    """Converts the return value from a view function to a real
    response object that is an instance of :attr:`response_class`.

    The following types are allowed for `rv`:

    .. tabularcolumns:: |p{3.5cm}|p{9.5cm}|

    ======================= ===========================================
    :attr:`response_class`  the object is returned unchanged
    :class:`str`            a response object is created with the
                            string as body
    :class:`unicode`        a response object is created with the
                            string encoded to utf-8 as body
    a WSGI function         the function is called as WSGI application
                            and buffered as response object
    :class:`tuple`          A tuple in the form ``(response, status,
                            headers)`` or ``(response, headers)``
                            where `response` is any of the
                            types defined here, `status` is a string
                            or an integer and `headers` is a list or
                            a dictionary with header values.
    ======================= ===========================================

    :param rv: the return value from the view function

    .. versionchanged:: 0.9
       Previously a tuple was interpreted as the arguments for the
       response object.
    """
    status_or_headers = headers = None
    if isinstance(rv, tuple):
        rv, status_or_headers, headers = rv + (None,) * (3 - len(rv))

    if rv is None:
        raise ValueError('View function did not return a response')

    if isinstance(status_or_headers, (dict, list)):
        headers, status_or_headers = status_or_headers, None

    if not isinstance(rv, self.response_class):
        # When we create a response object directly, we let the constructor
        # set the headers and status. We do this because there can be
        # some extra logic involved when creating these objects with
        # specific values (like default content type selection).
        if isinstance(rv, (text_type, bytes, bytearray)):
            rv = self.response_class(rv, headers=headers,
                                     status=status_or_headers)
            headers = status_or_headers = None
        else:
            rv = self.response_class.force_type(rv, request.environ)

    if status_or_headers is not None:
        if isinstance(status_or_headers, string_types):
            rv.status = status_or_headers
        else:
            rv.status_code = status_or_headers
    if headers:
        rv.headers.extend(headers)

    return rv
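The first few lines of make_response are the part that copes with the different tuple shapes a view may return. Pulled out into a standalone function (the name unpack_view_result is made up here; the logic is copied from the source above), the padding trick is easy to experiment with:

```python
def unpack_view_result(rv):
    """Normalize a view's return value to (body, status, headers)."""
    status_or_headers = headers = None
    if isinstance(rv, tuple):
        # Pad the tuple with None so it always unpacks into three names
        rv, status_or_headers, headers = rv + (None,) * (3 - len(rv))

    if rv is None:
        raise ValueError('View function did not return a response')

    # A (body, headers) tuple puts the headers in the second slot: move them
    if isinstance(status_or_headers, (dict, list)):
        headers, status_or_headers = status_or_headers, None

    return rv, status_or_headers, headers


print(unpack_view_result('hi'))                        # ('hi', None, None)
print(unpack_view_result(('hi', 404)))                 # ('hi', 404, None)
print(unpack_view_result(('hi', {'X-Demo': '1'})))     # ('hi', None, {'X-Demo': '1'})
print(unpack_view_result(('hi', 201, [('X-Demo', '1')])))
```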
Step 4: back inside wsgi_app:
def wsgi_app(self, environ, start_response):
    """The actual WSGI application. This is not implemented in
    `__call__` so that middlewares can be applied without losing a
    reference to the class. So instead of doing this::

        app = MyMiddleware(app)

    It's a better idea to do this instead::

        app.wsgi_app = MyMiddleware(app.wsgi_app)

    Then you still have the original application object around and
    can continue to call methods on it.

    .. versionchanged:: 0.7
       The behavior of the before and after request callbacks was changed
       under error conditions and a new callback was added that will
       always execute at the end of the request, independent on if an
       error occurred or not. See :ref:`callbacks-and-errors`.

    :param environ: a WSGI environment
    :param start_response: a callable accepting a status code,
                           a list of headers and an optional
                           exception context to start the response
    """
    ctx = self.request_context(environ)
    ctx.push()
    error = None
    try:
        try:
            response = self.full_dispatch_request()
        except Exception as e:
            error = e
            response = self.handle_exception(e)
        return response(environ, start_response)
    finally:
        if self.should_ignore_error(error):
            error = None
        ctx.auto_pop(error)
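And so, once response is obtained from full_dispatch_request(), it is called with environ and start_response, and the result is returned to the WSGI server (Gunicorn, for example).
That completes the flow from HTTP request to response.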
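The line return response(environ, start_response) works because the response object is itself a WSGI callable. Here is a minimal sketch of that idea; TinyResponse is an invented class, far simpler than the response class Flask actually uses.

```python
class TinyResponse:
    """Hypothetical response object that can be called like a WSGI app."""

    def __init__(self, body, status='200 OK', headers=None):
        self.body = body.encode('utf-8') if isinstance(body, str) else body
        self.status = status
        self.headers = list(headers or [('Content-Type', 'text/html; charset=utf-8')])

    def __call__(self, environ, start_response):
        headers = self.headers + [('Content-Length', str(len(self.body)))]
        start_response(self.status, headers)
        return [self.body]


def wsgi_app(environ, start_response):
    # Stand-in for full_dispatch_request() producing a response object
    response = TinyResponse('Hello World!')
    return response(environ, start_response)


sent = {}


def start_response(status, headers, exc_info=None):
    sent['status'] = status
    sent['headers'] = headers


result = wsgi_app({}, start_response)
print(sent['status'], sent['headers'], result)
```

Because the response knows how to emit its own status, headers and body, wsgi_app only has to hand it the two WSGI arguments and pass the result straight back to the server.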
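To recap the flow:
client -----> WSGI server ----> __call__ calls wsgi_app, which creates the request object and the context ------> full_dispatch_request ----> dispatch_request routes the URL to its view function and gets the return value ------> make_response converts the view function's return value into a response_class object -------> the response object is called with environ and start_response, and the final response goes back to the server.
Reading through source code on your own is really not easy, and it is hard to do without a solid foundation. But if you stick with it, don't you come away with a deeper understanding? Keep going. This is for every programmer who has read this far; let's encourage each other.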