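A friend once asked me how frameworks such as Flask and Express run inside a cloud function, and what the mechanism behind that is. Others have asked me how to build a Component. I took a look at the frameworks the Tencent Cloud Serverless architecture currently supports:
Quite a few are supported, but my favorite, Django, does not seem to be among them. That reminded me of the questions above, so here I will briefly walk through how I built a Django Component.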
Analyzing an existing Component (Flask as the example)
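As a first step, we need to know how other frameworks, such as Flask, are run. Let's start with Tencent Cloud's Flask-Component and deploy it by following its instructions:
Deployment is quick and painless. Then, in the function console, we download the deployed code to study it:
After downloading and unzipping it, we see a directory structure like this:
The items in the blue box are dependency packages, and app.py, marked in yellow, is the code we wrote ourselves. So what are the two files circled in red, and where did they come from?
Contents of api_server.py: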
import app  # Replace with your actual application
import severless_wsgi

# If you need to send additional content types as text, add them directly
# to the whitelist:
#
# severless_wsgi.TEXT_MIME_TYPES.append("application/custom+json")


def handler(event, context):
    return severless_wsgi.handle_request(app.app, event, context)
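As we can see, this file imports the app.py we created, takes the app object from it, and passes it together with event and context to the handle_request method in severless_wsgi.py. So what does this method do?
The method is long and a bit dizzying to read, but one piece of code stands out right away:
What does it do? It converts the arguments we received (event and context) and collects the result into environ. Then, using the werkzeug dependency, it turns that content into a request object and calls the from_app method together with the app object mentioned above, which gives us the response:
Finally, it returns the result in the format required by API Gateway's integrated response.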
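To make that last step more concrete, here is a minimal sketch of wrapping a raw result in an API Gateway style response. The function name to_apigw_response and the short TEXT_MIME_TYPES set are my own illustrative assumptions, not the component's actual code. The field names (statusCode, headers, body, isBase64Encoded) are the ones that appear in the index.py shown later in this article.

```python
import base64

# Assumed, non-exhaustive set of MIME types that are treated as text.
TEXT_MIME_TYPES = {"application/json", "application/javascript", "application/xml"}


def to_apigw_response(status_code, headers, body, mimetype="text/plain"):
    """Wrap a raw HTTP result (body as bytes) in an API Gateway style dict."""
    is_text = mimetype.startswith("text/") or mimetype in TEXT_MIME_TYPES
    if is_text:
        payload = body.decode("utf-8")
    else:
        # Binary payloads cannot travel as plain JSON strings,
        # so they are Base64-encoded and flagged accordingly.
        payload = base64.b64encode(body).decode("utf-8")
    return {
        "statusCode": status_code,
        "headers": dict(headers),
        "body": payload,
        "isBase64Encoded": not is_text,
    }


text = to_apigw_response(200, [("Content-Type", "text/html")], b"<h1>hi</h1>", "text/html")
binary = to_apigw_response(200, [("Content-Type", "image/png")], b"\x89PNG", "image/png")
```

Text bodies are passed through as strings, while anything else is Base64-encoded so that it survives the JSON round trip between the function and the gateway.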
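By now you may have an idea forming. Let's take a look at how frameworks such as Flask and Django work under the hood:
Combining this simplified diagram with what I just described, we can see that in normal use a request goes through a web server before it reaches the next stage. A cloud function, however, is just a function and does not need to start a web server, so we can call the wsgi_app method directly. Here, environ is the object we obtained above by processing event and context, and start_response is a callable we supply ourselves; it can be backed by a small data structure of our own that records, for example, the shape of our response. So if we wanted to implement this process ourselves, without using Tencent Cloud's flask-component, we could do it like this: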
import sys

try:
    from urllib import urlencode
except ImportError:
    from urllib.parse import urlencode

from io import BytesIO

from flask import Flask

__version__ = '0.0.4'


def make_environ(event):
    environ = {}

    # Map the HTTP headers of the event onto WSGI variables
    for hdr_name, hdr_value in event['headers'].items():
        hdr_name = hdr_name.replace('-', '_').upper()
        if hdr_name in ['CONTENT_TYPE', 'CONTENT_LENGTH']:
            environ[hdr_name] = hdr_value
            continue

        http_hdr_name = 'HTTP_%s' % hdr_name
        environ[http_hdr_name] = hdr_value

    apigateway_qs = event['queryStringParameters']
    request_qs = event['queryString']
    qs = apigateway_qs.copy()
    qs.update(request_qs)

    # wsgi.input must be a byte stream, so encode text bodies
    body = event.get('body') or ''
    if not isinstance(body, bytes):
        body = body.encode('utf-8')

    environ['REQUEST_METHOD'] = event['httpMethod']
    environ['PATH_INFO'] = event['path']
    environ['QUERY_STRING'] = urlencode(qs) if qs else ''
    # Assumes the client IP is carried in requestContext.sourceIp; falls back to ''
    environ['REMOTE_ADDR'] = event.get('requestContext', {}).get('sourceIp', '')
    environ['HOST'] = event['headers']['host']
    environ['SCRIPT_NAME'] = ''
    environ['SERVER_PORT'] = '80'
    environ['SERVER_PROTOCOL'] = 'HTTP/1.1'
    environ['CONTENT_LENGTH'] = str(len(body))
    environ['wsgi.url_scheme'] = 'http'
    environ['wsgi.input'] = BytesIO(body)
    environ['wsgi.version'] = (1, 0)
    environ['wsgi.errors'] = sys.stderr
    environ['wsgi.multithread'] = False
    environ['wsgi.run_once'] = True
    environ['wsgi.multiprocess'] = False

    return environ


class LambdaResponse(object):
    def __init__(self):
        self.status = None
        self.response_headers = None

    def start_response(self, status, response_headers, exc_info=None):
        # status looks like "200 OK"; keep only the numeric code
        self.status = int(status[:3])
        self.response_headers = dict(response_headers)


class FlaskLambda(Flask):
    def __call__(self, event, context):
        if 'httpMethod' not in event:
            # Not an API Gateway event: fall back to the normal WSGI call
            print('httpMethod not in event')
            return super(FlaskLambda, self).__call__(event, context)

        response = LambdaResponse()

        # Join all chunks of the WSGI response and decode them for the JSON reply
        body = b''.join(self.wsgi_app(
            make_environ(event),
            response.start_response
        )).decode('utf-8')

        return {
            'statusCode': response.status,
            'headers': response.response_headers,
            'body': body
        }
With this, the flow becomes simpler and clearer. The whole implementation can be seen as "cutting off" or "replacing" the web server part.
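To see the "no web server" idea in isolation, here is a minimal sketch that uses only the standard library. The names demo_app and call_wsgi are made up for illustration; demo_app merely stands in for a framework's WSGI entry point (such as Flask's wsgi_app), and the environ built here contains just the basic keys, not everything a real gateway event would provide.

```python
import io
import sys


def demo_app(environ, start_response):
    """A tiny WSGI application standing in for a framework's wsgi_app."""
    body = ("Hello " + environ["PATH_INFO"]).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def call_wsgi(app, method, path):
    """Invoke a WSGI app directly, with no web server in between."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        # Record what the application reports instead of writing to a socket.
        captured["status"] = int(status.split(" ", 1)[0])
        captured["headers"] = dict(headers)

    environ = {
        "REQUEST_METHOD": method,
        "PATH_INFO": path,
        "QUERY_STRING": "",
        "SCRIPT_NAME": "",
        "SERVER_NAME": "localhost",
        "SERVER_PORT": "80",
        "SERVER_PROTOCOL": "HTTP/1.1",
        "wsgi.version": (1, 0),
        "wsgi.url_scheme": "http",
        "wsgi.input": io.BytesIO(b""),
        "wsgi.errors": sys.stderr,
        "wsgi.multithread": False,
        "wsgi.multiprocess": False,
        "wsgi.run_once": True,
    }
    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


status, headers, body = call_wsgi(demo_app, "GET", "/world")
```

Everything a web server would normally do here (build environ, supply start_response, collect the body) is done by a plain function call, which is exactly the role the cloud function handler takes over.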
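That is the basic approach to analyzing Flask-Component. Following the same approach, can we deploy the Django framework onto the Serverless architecture? And what is the difference between Flask and Django? By "difference" I specifically mean the difference in how they start up and run.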
Extending the idea: implementing a Django Component
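On closer thought, there does not seem to be any difference. So can we simply reuse the Flask conversion logic and swap Flask's app for Django's app?
That is, replace: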
from flask import Flask
app = Flask(__name__)
with:
import os
from django.core.wsgi import get_wsgi_application
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'mydjango.settings')
application = get_wsgi_application()
Would that solve the problem?
Let's give it a try:
Create a Django project, then add index.py directly:
# -*- coding: utf-8 -*-
import os
import sys
import base64
from werkzeug.datastructures import Headers, MultiDict
from werkzeug.wrappers import Response
from werkzeug.urls import url_encode, url_unquote
from werkzeug.http import HTTP_STATUS_CODES
from werkzeug._compat import BytesIO, string_types, to_bytes, wsgi_encoding_dance
import mydjango.wsgi

TEXT_MIME_TYPES = [
    "application/json",
    "application/javascript",
    "application/xml",
    "application/vnd.api+json",
    "image/svg+xml",
]


def all_casings(input_string):
    if not input_string:
        yield ""
    else:
        first = input_string[:1]
        if first.lower() == first.upper():
            for sub_casing in all_casings(input_string[1:]):
                yield first + sub_casing
        else:
            for sub_casing in all_casings(input_string[1:]):
                yield first.lower() + sub_casing
                yield first.upper() + sub_casing


def split_headers(headers):
    """
    If there are multiple occurrences of headers, create case-mutated variations
    in order to pass them through APIGW. This is a hack that's currently
    needed. See: https://github.com/logandk/serverless-wsgi/issues/11
    Source: https://github.com/Miserlou/Zappa/blob/master/zappa/middleware.py
    """
    new_headers = {}

    for key in headers.keys():
        values = headers.get_all(key)
        if len(values) > 1:
            for value, casing in zip(values, all_casings(key)):
                new_headers[casing] = value
        elif len(values) == 1:
            new_headers[key] = values[0]

    return new_headers


def group_headers(headers):
    new_headers = {}

    for key in headers.keys():
        new_headers[key] = headers.get_all(key)

    return new_headers


def encode_query_string(event):
    multi = event.get(u"multiValueQueryStringParameters")
    if multi:
        return url_encode(MultiDict((i, j) for i in multi for j in multi[i]))
    else:
        return url_encode(event.get(u"queryString") or {})


def handle_request(application, event, context):
    if u"multiValueHeaders" in event:
        headers = Headers(event["multiValueHeaders"])
    else:
        headers = Headers(event["headers"])

    strip_stage_path = os.environ.get("STRIP_STAGE_PATH", "").lower().strip() in [
        "yes",
        "y",
        "true",
        "t",
        "1",
    ]
    if u"apigw.tencentcs.com" in headers.get(u"Host", u"") and not strip_stage_path:
        script_name = "/{}".format(event["requestContext"].get(u"stage", ""))
    else:
        script_name = ""

    path_info = event["path"]
    base_path = os.environ.get("API_GATEWAY_BASE_PATH")
    if base_path:
        script_name = "/" + base_path

        if path_info.startswith(script_name):
            path_info = path_info[len(script_name) :] or "/"

    if u"body" in event:
        body = event[u"body"] or ""
    else:
        body = ""

    if event.get("isBase64Encoded", False):
        body = base64.b64decode(body)
    if isinstance(body, string_types):
        body = to_bytes(body, charset="utf-8")

    environ = {
        "CONTENT_LENGTH": str(len(body)),
        "CONTENT_TYPE": headers.get(u"Content-Type", ""),
        "PATH_INFO": url_unquote(path_info),
        "QUERY_STRING": encode_query_string(event),
        "REMOTE_ADDR": event["requestContext"]
        .get(u"identity", {})
        .get(u"sourceIp", ""),
        "REMOTE_USER": event["requestContext"]
        .get(u"authorizer", {})
        .get(u"principalId", ""),
        "REQUEST_METHOD": event["httpMethod"],
        "SCRIPT_NAME": script_name,
        "SERVER_NAME": headers.get(u"Host", "lambda"),
        "SERVER_PORT": headers.get(u"X-Forwarded-Port", "80"),
        "SERVER_PROTOCOL": "HTTP/1.1",
        "wsgi.errors": sys.stderr,
        "wsgi.input": BytesIO(body),
        "wsgi.multiprocess": False,
        "wsgi.multithread": False,
        "wsgi.run_once": False,
        "wsgi.url_scheme": headers.get(u"X-Forwarded-Proto", "http"),
        "wsgi.version": (1, 0),
        "serverless.authorizer": event["requestContext"].get(u"authorizer"),
        "serverless.event": event,
        "serverless.context": context,
        # TODO: Deprecate the following entries, as they do not comply with the WSGI
        # spec. For custom variables, the spec says:
        #
        #   Finally, the environ dictionary may also contain server-defined variables.
        #   These variables should be named using only lower-case letters, numbers, dots,
        #   and underscores, and should be prefixed with a name that is unique to the
        #   defining server or gateway.
        "API_GATEWAY_AUTHORIZER": event["requestContext"].get(u"authorizer"),
        "event": event,
        "context": context,
    }

    for key, value in environ.items():
        if isinstance(value, string_types):
            environ[key] = wsgi_encoding_dance(value)

    for key, value in headers.items():
        key = "HTTP_" + key.upper().replace("-", "_")
        if key not in ("HTTP_CONTENT_TYPE", "HTTP_CONTENT_LENGTH"):
            environ[key] = value

    response = Response.from_app(application, environ)

    returndict = {u"statusCode": response.status_code}

    if u"multiValueHeaders" in event:
        returndict["multiValueHeaders"] = group_headers(response.headers)
    else:
        returndict["headers"] = split_headers(response.headers)

    if event.get("requestContext").get("elb"):
        # If the request comes from ALB we need to add a status description
        returndict["statusDescription"] = u"%d %s" % (
            response.status_code,
            HTTP_STATUS_CODES[response.status_code],
        )

    if response.data:
        mimetype = response.mimetype or "text/plain"
        if (
            mimetype.startswith("text/") or mimetype in TEXT_MIME_TYPES
        ) and not response.headers.get("Content-Encoding", ""):
            returndict["body"] = response.get_data(as_text=True)
            returndict["isBase64Encoded"] = False
        else:
            returndict["body"] = base64.b64encode(response.data).decode("utf-8")
            returndict["isBase64Encoded"] = True

    return returndict


def main_handler(event, context):
    return handle_request(mydjango.wsgi.application, event, context)
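Probably the least obvious part of this file is split_headers. The sketch below reproduces the idea with the standard library only, assuming headers arrive as a plain list of (name, value) pairs instead of a werkzeug Headers object. split_header_pairs is a hypothetical name of mine, and unlike werkzeug it compares header names case-sensitively, which is a simplification.

```python
def all_casings(text):
    """Yield every upper/lower-case variant of text, lowercase variants first."""
    if not text:
        yield ""
        return
    first = text[:1]
    for rest in all_casings(text[1:]):
        if first.lower() == first.upper():
            # Characters without case (digits, '-') have a single variant.
            yield first + rest
        else:
            yield first.lower() + rest
            yield first.upper() + rest


def split_header_pairs(pairs):
    """Collapse (name, value) pairs into a dict without losing repeated names."""
    grouped = {}
    for name, value in pairs:
        grouped.setdefault(name, []).append(value)

    result = {}
    for name, values in grouped.items():
        if len(values) == 1:
            result[name] = values[0]
        else:
            # Give each repeated value its own differently-cased key.
            for value, casing in zip(values, all_casings(name)):
                result[casing] = value
    return result


flat = split_header_pairs([
    ("Content-Type", "text/html"),
    ("Set-Cookie", "a=1"),
    ("Set-Cookie", "b=2"),
])
# flat == {"Content-Type": "text/html", "set-cookie": "a=1", "Set-cookie": "b=2"}
```

Because a JSON object cannot hold the same key twice, two Set-Cookie headers would otherwise overwrite each other. HTTP header names are case-insensitive, so the client still sees both as Set-Cookie.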
Then we deploy it to a function and look at the result:
Function information:
from django.shortcuts import render
from django.http import HttpResponse
from django.views.decorators.csrf import csrf_exempt


# Create your views here.
@csrf_exempt
def hello(request):
    # Default to an empty string so a missing "name" does not raise a TypeError
    if request.method == "POST":
        return HttpResponse("Hello world ! " + request.POST.get("name", ""))
    if request.method == "GET":
        return HttpResponse("Hello world ! " + request.GET.get("name", ""))
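After deployment, we bind an apigw trigger and test it in Postman:
GET:
POST:
As you can see, based on a basic analysis of how things run and a small adaptation of Django, we got Django running on Serverless just by adding one file and the related dependencies.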
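Before reaching for Postman, the handler could also be exercised locally with a hand-built event. The sketch below is only an assumption-laden helper: make_apigw_event is a name I made up, it fills in just the event fields that index.py reads (httpMethod, path, headers, queryString, body, isBase64Encoded, requestContext), and the /hello route is assumed because the URL configuration is not shown in this article.

```python
import json


def make_apigw_event(method, path, query=None, body="", headers=None):
    """Build a dict carrying only the event fields that index.py reads."""
    return {
        "httpMethod": method,
        "path": path,
        "headers": headers or {"Host": "localhost"},
        "queryString": query or {},
        "body": body,
        "isBase64Encoded": False,
        "requestContext": {"stage": "release"},
    }


get_event = make_apigw_event("GET", "/hello", query={"name": "serverless"})
post_event = make_apigw_event(
    "POST",
    "/hello",
    body="name=serverless",
    headers={
        "Host": "localhost",
        "Content-Type": "application/x-www-form-urlencoded",
    },
)

# With index.py and the Django project importable, the handler could be called as:
#   from index import main_handler
#   print(json.dumps(main_handler(get_event, None), indent=2))
print(json.dumps(get_event, indent=2))
```

A real API Gateway event carries more fields than this, so treat it as a quick smoke test and not as a replacement for testing through the gateway.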
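Next, let's see how to turn this code into a Component:
First, clone the Flask-Component code:
Then we modify it to fit Django:
The first part consists of a dependency package we may rely on and the index.py file we just added. When a user invokes this Component, we put these two into the user's code and upload them together.
The second part is serverless.js. Its basic format looks like this: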
const { Component } = require('@serverless/core')

class TencentDjango extends Component {
  async default(inputs = {}) {
  }

  async remove(inputs = {}) {
  }
}

module.exports = TencentDjango
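When the user runs sls, the default method is called; when the user runs sls remove, the remove method is called. So default can be regarded as deployment and remove as removal.
The deployment flow is fairly simple. First, copy and process the files. Then call the cloud function component directly, using the function's include parameter to add these extra files, and call the apigw component to manage the gateway. Whatever the user writes under inputs in the yaml is available in inputs; all we have to do is pass each part to the corresponding component:
Besides passing these two parts along, information such as region also needs to be handled accordingly. Calling the underlying components is simple as well: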
const tencentCloudFunction = await this.load('@serverless/tencent-scf')
const tencentCloudFunctionOutputs = await tencentCloudFunction(inputs)
Once this is done, all that remains is to update package.json and the readme.
I have already open-sourced it: https://github.com/gosls/tencent-django
It is also published on NPM: https://www.npmjs.com/package/@gosls/tencent-django
To use it, simply reference this Component:
DjangoTest:
  component: '@serverless/tencent-django'
  inputs:
    region: ap-guangzhou
    functionName: DjangoFunctionTest
    djangoProjectName: mydjango
    code: ./
    functionConf:
      timeout: 10
      memorySize: 256
      environment:
        variables:
          TEST: vale
      vpcConfig:
        subnetId: ''
        vpcId: ''
    apigatewayConf:
      protocols:
        - http
      environment: release
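The routing of these inputs described earlier (function settings go to the cloud function component, gateway settings go to the apigw component, and shared values such as region go to both) can be sketched as follows. The real component is written in JavaScript; this is a Python illustration only, to stay consistent with the rest of the article. The function split_inputs and the output keys name and codeUri are hypothetical, not the actual interface of the underlying components.

```python
def split_inputs(inputs):
    """Route user inputs to the two lower-level components.

    The output key names "name" and "codeUri" are hypothetical.
    """
    # Values shared by both components; the default region is an assumption.
    common = {"region": inputs.get("region", "ap-guangzhou")}

    function_inputs = dict(common)
    function_inputs["name"] = inputs["functionName"]
    function_inputs["codeUri"] = inputs.get("code", "./")
    function_inputs.update(inputs.get("functionConf", {}))

    gateway_inputs = dict(common)
    gateway_inputs.update(inputs.get("apigatewayConf", {}))

    return function_inputs, gateway_inputs


user_inputs = {
    "region": "ap-guangzhou",
    "functionName": "DjangoFunctionTest",
    "djangoProjectName": "mydjango",
    "code": "./",
    "functionConf": {"timeout": 10, "memorySize": 256},
    "apigatewayConf": {"protocols": ["http"], "environment": "release"},
}
function_inputs, gateway_inputs = split_inputs(user_inputs)
```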
That completes the development and testing of the Django Component.