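Big Brother's Python Source Code Study Notes (19): The Function Mechanism in the Virtual Machine (1)
Big Brother's Python Source Code Study Notes (21): The Function Mechanism in the Virtual Machine (3)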
III. Implementing Function Parameters
1.1 Parameter types
- In Python, arguments fall into four categories:
Type | Example |
---|---|
Positional argument | f(a) |
Keyword argument | f(k=v) |
Excess positional argument | f(*args) |
Excess keyword argument | f(**kwargs) |
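- An excess positional argument is a generalized form of a positional argument, and an excess keyword argument is a generalized form of a keyword argument.
- Recall call_function: the instruction argument oparg records the total number of arguments, nkwargs holds the number of keyword arguments, and nargs holds the number of positional arguments.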
ceval.c
Py_LOCAL_INLINE(PyObject *) _Py_HOT_FUNCTION
call_function(PyObject ***pp_stack, Py_ssize_t oparg, PyObject *kwnames)
{
PyObject **pfunc = (*pp_stack) - oparg - 1;
PyObject *func = *pfunc;
PyObject *x, *w;
Py_ssize_t nkwargs = (kwnames == NULL) ? 0 : PyTuple_GET_SIZE(kwnames);
Py_ssize_t nargs = oparg - nkwargs;
PyObject **stack = (*pp_stack) - nargs - nkwargs;
... ...
- Py_ssize_t is a signed integer type that occupies 8 bytes on 64-bit systems, so these counters can in principle hold values up to 2^63 - 1.
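The index arithmetic at the top of call_function can be mimicked with a plain Python list standing in for the runtime stack. This is only an illustrative sketch: `split_call_stack` and the list-based stack are hypothetical names introduced here, not part of CPython.

```python
def split_call_stack(stack, oparg, kwnames=None):
    """Mimic the index arithmetic at the top of call_function.

    `stack` is a list whose last element is the top of the runtime stack.
    `oparg` is the total number of arguments pushed for the call.
    `kwnames` is the tuple of keyword names, or None for CALL_FUNCTION.
    """
    nkwargs = 0 if kwnames is None else len(kwnames)
    nargs = oparg - nkwargs
    # pfunc = (*pp_stack) - oparg - 1: the callable sits below all arguments.
    func = stack[len(stack) - oparg - 1]
    # stack = (*pp_stack) - nargs - nkwargs: first argument slot.
    arguments = stack[len(stack) - oparg:]
    positional = arguments[:nargs]
    keywords = dict(zip(kwnames or (), arguments[nargs:]))
    return func, positional, keywords


# Stack layout for f(1, 2, k=3) once the keyword-name tuple has been popped.
runtime_stack = ["<f>", 1, 2, 3]
func, positional, keywords = split_call_stack(runtime_stack, 3, ("k",))
print(func, positional, keywords)
```

The keyword values are the last nkwargs entries, which is why nargs can be recovered as oparg - nkwargs without any extra bookkeeping.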
1.1.1 Positional arguments
def f(a):
...
f(1)
1 0 LOAD_CONST 0 (<code object f at 0x000001D063E40F60, file "demo.py", line 1>)
2 LOAD_CONST 1 ('f')
4 MAKE_FUNCTION 0
6 STORE_NAME 0 (f)
4 8 LOAD_NAME 0 (f)
10 LOAD_CONST 2 (1)
12 CALL_FUNCTION 1
14 POP_TOP
16 LOAD_CONST 3 (None)
18 RETURN_VALUE
consts
code
argcount 1
nlocals 1
stacksize 1
flags 0043
2 0 LOAD_CONST 0 (None)
2 RETURN_VALUE
1.1.2 Keyword arguments
def f(k=2):
...
f(k=1)
1 0 LOAD_CONST 6 ((2,))
2 LOAD_CONST 1 (<code object f at 0x000001A15E0A0F60, file "demo.py", line 1>)
4 LOAD_CONST 2 ('f')
6 MAKE_FUNCTION 1
8 STORE_NAME 0 (f)
4 10 LOAD_NAME 0 (f)
12 LOAD_CONST 3 (1)
14 LOAD_CONST 4 (('k',))
16 CALL_FUNCTION_KW 1
18 POP_TOP
20 LOAD_CONST 5 (None)
22 RETURN_VALUE
consts
2
code
argcount 1
nlocals 1
stacksize 1
flags 0043
2 0 LOAD_CONST 0 (None)
2 RETURN_VALUE
1.1.3 Excess positional arguments
def f(*args):
...
f(1,2,3)
1 0 LOAD_CONST 0 (<code object f at 0x0000017458790F60, file "demo.py", line 1>)
2 LOAD_CONST 1 ('f')
4 MAKE_FUNCTION 0
6 STORE_NAME 0 (f)
4 8 LOAD_NAME 0 (f)
10 LOAD_CONST 2 (1)
12 LOAD_CONST 3 (2)
14 LOAD_CONST 4 (3)
16 CALL_FUNCTION 3
18 POP_TOP
20 LOAD_CONST 5 (None)
22 RETURN_VALUE
consts
code
argcount 0
nlocals 1
stacksize 1
flags 0047
2 0 LOAD_CONST 0 (None)
2 RETURN_VALUE
1.1.4 Excess keyword arguments
def f(**kwargs):
...
f(a=1,b=2,c=3)
1 0 LOAD_CONST 0 (<code object f at 0x0000012F22D80F60, file "demo.py", line 1>)
2 LOAD_CONST 1 ('f')
4 MAKE_FUNCTION 0
6 STORE_NAME 0 (f)
4 8 LOAD_NAME 0 (f)
10 LOAD_CONST 2 (1)
12 LOAD_CONST 3 (2)
14 LOAD_CONST 4 (3)
16 LOAD_CONST 5 (('a', 'b', 'c'))
18 CALL_FUNCTION_KW 3
20 POP_TOP
22 LOAD_CONST 6 (None)
24 RETURN_VALUE
consts
code
argcount 0
nlocals 1
stacksize 1
flags 004b
2 0 LOAD_CONST 0 (None)
2 RETURN_VALUE
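The argcount, nlocals and flags values in the four listings above can also be read directly from a function's code object. The sketch below assumes only the standard `inspect` constants CO_VARARGS and CO_VARKEYWORDS; the helper `describe` and the four function names are made up for illustration. Other bits of co_flags differ between Python versions, so only these two bits are inspected.

```python
import inspect


def positional(a): ...
def keyword(k=2): ...
def var_positional(*args): ...
def var_keyword(**kwargs): ...


def describe(func):
    """Summarize the code-object fields that describe a function's parameters."""
    code = func.__code__
    return {
        "argcount": code.co_argcount,
        "nlocals": code.co_nlocals,
        "varargs": bool(code.co_flags & inspect.CO_VARARGS),
        "varkeywords": bool(code.co_flags & inspect.CO_VARKEYWORDS),
    }


for func in (positional, keyword, var_positional, var_keyword):
    print(func.__name__, describe(func))
```

Note that *args and **kwargs do not count toward argcount, yet each still occupies one local variable slot.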
1.2 Passing positional arguments
demo.py
def f(a):
a+=1
f(1)
1 0 LOAD_CONST 0 (<code object f at 0x00000170A8ED7A50, file "demo.py", line 1>)
2 LOAD_CONST 1 ('f')
4 MAKE_FUNCTION 0
6 STORE_NAME 0 (f)
3 8 LOAD_NAME 0 (f)
10 LOAD_CONST 2 (1)
12 CALL_FUNCTION 1
14 POP_TOP
16 LOAD_CONST 3 (None)
18 RETURN_VALUE
consts
code
argcount 1
nlocals 1
stacksize 2
flags 0043
2 0 LOAD_FAST 0 (a)
2 LOAD_CONST 1 (1)
4 INPLACE_ADD
6 STORE_FAST 0 (a)
8 LOAD_CONST 0 (None)
10 RETURN_VALUE
- Before CALL_FUNCTION, besides the LOAD_NAME that pushes f, there is a LOAD_CONST that pushes the required argument onto the runtime stack.
ceval.c
Py_LOCAL_INLINE(PyObject *) _Py_HOT_FUNCTION
call_function(PyObject ***pp_stack, Py_ssize_t oparg, PyObject *kwnames)
{
PyObject **pfunc = (*pp_stack) - oparg - 1;
PyObject *func = *pfunc;
PyObject *x, *w;
Py_ssize_t nkwargs = (kwnames == NULL) ? 0 : PyTuple_GET_SIZE(kwnames);
Py_ssize_t nargs = oparg - nkwargs;
PyObject **stack = (*pp_stack) - nargs - nkwargs;
/* Always dispatch PyCFunction first, because these are
presumed to be the most frequent callable object.
*/
if (PyCFunction_Check(func)) {
PyThreadState *tstate = PyThreadState_GET();
C_TRACE(x, _PyCFunction_FastCallKeywords(func, stack, nargs, kwnames));
}
... ...
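- In call_function, oparg is the number of arguments. pfunc is obtained by stepping back oparg + 1 slots from the stack top *pp_stack, and it points to the callable being invoked (for demo.py, the PyFunctionObject of f).
- nkwargs and nargs hold the number of keyword arguments and positional arguments, respectively.
- nargs is then passed, together with stack, to the fast-call routine (_PyCFunction_FastCallKeywords in the branch shown, which handles C functions), so that the callee knows how many positional arguments there are and where they sit on the runtime stack.
- The function's arguments end up in the f_localsplus area of the PyFrameObject. When the frame is created, _PyFrame_New_NoTrack sizes this area and initializes every slot to NULL.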
Objects\frameobject.c
PyFrameObject* _Py_HOT_FUNCTION
_PyFrame_New_NoTrack(PyThreadState *tstate, PyCodeObject *code,
PyObject *globals, PyObject *locals)
{
PyFrameObject *back = tstate->frame;
PyFrameObject *f;
PyObject *builtins;
Py_ssize_t i;
... ...
extras = code->co_nlocals + ncells + nfrees;
f->f_valuestack = f->f_localsplus + extras;
for (i=0; i<extras; i++)
f->f_localsplus[i] = NULL;
f->f_locals = NULL;
f->f_trace = NULL;
}
... ...
return f;
}
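The quantity extras = co_nlocals + ncells + nfrees from the excerpt above can be recomputed from Python for any code object. This is a rough sketch assuming the layout shown in this excerpt; newer CPython versions may organize frames differently. The names `outer`, `inner` and `extras` are illustrative only.

```python
def outer(x):
    def inner(y):
        # x is a free variable of inner and a cell variable of outer.
        return x + y
    return inner


def extras(code):
    """Slots that precede the value stack: locals + cell vars + free vars."""
    return code.co_nlocals + len(code.co_cellvars) + len(code.co_freevars)


inner_code = outer(1).__code__
print(inner_code.co_varnames, inner_code.co_freevars, extras(inner_code))
print(outer.__code__.co_cellvars)
```

For a function without closures, ncells and nfrees are both 0, so the value stack starts right after the local variables.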
1.3 Accessing positional arguments
- Arguments are read with the LOAD_FAST instruction; the corresponding virtual machine source is:
ceval.c
TARGET(LOAD_FAST) {
PyObject *value = GETLOCAL(oparg);
if (value == NULL) {
format_exc_check_arg(PyExc_UnboundLocalError,
UNBOUNDLOCAL_ERROR_MSG,
PyTuple_GetItem(co->co_varnames, oparg));
goto error;
}
Py_INCREF(value);
PUSH(value);
FAST_DISPATCH();
}
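- The GETLOCAL macro reads fastlocals[oparg]. For the parameter a this is f_localsplus[0] (LOAD_FAST 0), and LOAD_FAST pushes that object onto the runtime stack.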
#define GETLOCAL(i) (fastlocals[i])
- The STORE_FAST instruction, in turn, updates f_localsplus[0], that is, the parameter.
ceval.c
PREDICTED(STORE_FAST);
TARGET(STORE_FAST) {
PyObject *value = POP();
SETLOCAL(oparg, value);
FAST_DISPATCH();
}
#define SETLOCAL(i, value) do { PyObject *tmp = GETLOCAL(i); \
GETLOCAL(i) = value; \
Py_XDECREF(tmp); } while (0)
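- In summary, Python pushes the argument values onto the runtime stack from left to right and then copies them, in order, into f_localsplus of the PyFrameObject.
- When an argument is accessed, the virtual machine uses an index to fetch the value object stored in f_localsplus for that name, with no name lookup involved.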
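The interplay of LOAD_FAST, STORE_FAST and the fastlocals array can be imitated with a toy frame. `MiniFrame` and the `_NULL` sentinel are hypothetical stand-ins written for this note, not CPython structures; the instruction sequence replayed at the end is the body of f from demo.py (a += 1).

```python
_NULL = object()  # plays the role of a NULL slot in f_localsplus


class MiniFrame:
    """Toy frame: an indexed fastlocals array plus a value stack."""

    def __init__(self, varnames):
        self.varnames = varnames
        self.fastlocals = [_NULL] * len(varnames)
        self.stack = []

    def load_fast(self, oparg):
        # GETLOCAL(oparg), then PUSH(value); an empty slot is an error.
        value = self.fastlocals[oparg]
        if value is _NULL:
            raise UnboundLocalError(
                f"local variable '{self.varnames[oparg]}' referenced before assignment"
            )
        self.stack.append(value)

    def store_fast(self, oparg):
        # POP(), then SETLOCAL(oparg, value).
        self.fastlocals[oparg] = self.stack.pop()

    def inplace_add(self):
        right = self.stack.pop()
        left = self.stack.pop()
        self.stack.append(left + right)


frame = MiniFrame(("a",))
frame.fastlocals[0] = 1   # the argument 1 is copied into slot 0 at call time
frame.load_fast(0)        # LOAD_FAST   0 (a)
frame.stack.append(1)     # LOAD_CONST  1 (1)
frame.inplace_add()       # INPLACE_ADD
frame.store_fast(0)       # STORE_FAST  0 (a)
print(frame.fastlocals)
```

Because the compiler already knows the slot index of every local, neither instruction needs to look up the name a at run time.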
1.4 Default values of positional arguments
demo.py
def f(a=1):
a+=1
f()
1 0 LOAD_CONST 4 ((1,))
2 LOAD_CONST 1 (<code object f at 0x000001B4C0FC7A50, file "demo.py", line 1>)
4 LOAD_CONST 2 ('f')
6 MAKE_FUNCTION 1
8 STORE_NAME 0 (f)
3 10 LOAD_NAME 0 (f)
12 CALL_FUNCTION 0
14 POP_TOP
16 LOAD_CONST 3 (None)
18 RETURN_VALUE
consts
1
code
argcount 1
nlocals 1
stacksize 2
flags 0043
2 0 LOAD_FAST 0 (a)
2 LOAD_CONST 1 (1)
4 INPLACE_ADD
6 STORE_FAST 0 (a)
8 LOAD_CONST 0 (None)
10 RETURN_VALUE
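- Comparing with the earlier listing, one extra LOAD_CONST appears before the code object of f is loaded; it pushes the tuple of default values, (1,), onto the runtime stack.
- In MAKE_FUNCTION, when oparg has the 0x01 bit set, this PyTupleObject is popped from the runtime stack and stored in the func_defaults field of the PyFunctionObject. PyFunction_SetDefaults, shown after it, is the C API function that sets the same field after type checks.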
ceval.c
TARGET(MAKE_FUNCTION) {
PyObject *qualname = POP();
PyObject *codeobj = POP();
PyFunctionObject *func = (PyFunctionObject *)
PyFunction_NewWithQualName(codeobj, f->f_globals, qualname);
Py_DECREF(codeobj);
Py_DECREF(qualname);
if (func == NULL) {
goto error;
}
if (oparg & 0x08) {
assert(PyTuple_CheckExact(TOP()));
func ->func_closure = POP();
}
if (oparg & 0x04) {
assert(PyDict_CheckExact(TOP()));
func->func_annotations = POP();
}
if (oparg & 0x02) {
assert(PyDict_CheckExact(TOP()));
func->func_kwdefaults = POP();
}
if (oparg & 0x01) {
assert(PyTuple_CheckExact(TOP()));
func->func_defaults = POP();
}
PUSH((PyObject *)func);
DISPATCH();
}
Objects\funcobject.c
int
PyFunction_SetDefaults(PyObject *op, PyObject *defaults)
{
if (!PyFunction_Check(op)) {
PyErr_BadInternalCall();
return -1;
}
if (defaults == Py_None)
defaults = NULL;
else if (defaults && PyTuple_Check(defaults)) {
Py_INCREF(defaults);
}
else {
PyErr_SetString(PyExc_SystemError, "non-tuple default args");
return -1;
}
Py_XSETREF(((PyFunctionObject *)op)->func_defaults, defaults);
return 0;
}
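The func_defaults field is visible from Python as the `__defaults__` attribute, which makes the mechanism easy to observe. The snippet below is a small sketch; unlike demo.py, this variant of f returns a so that the effect of the default can be printed.

```python
def f(a=1):
    a += 1
    return a


print(f.__defaults__)    # the tuple stored in func_defaults
before = f()             # a falls back to the stored default

f.__defaults__ = (10,)   # replace the stored tuple on the function object
after = f()
print(before, after)
```

The defaults live on the function object rather than in the code object, which is why they can be replaced without recompiling f.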
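1.5 Excess positional and excess keyword arguments
- The excess positional argument *list and the excess keyword argument **key are in fact implemented as local variables.
- Inside Python, *list is backed by a PyTupleObject, and **key is backed by a PyDictObject.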
demo.py
def f(*args,**kwargs):
print(f"args:{args}")
print(f"kwargs:{kwargs}")
f(1,2,3,a=4,b=5)
1 0 LOAD_CONST 0 (<code object f at 0x0000026837138A50, file "G:\learn_core\demo.py", line 1>)
2 LOAD_CONST 1 ('f')
4 MAKE_FUNCTION 0
6 STORE_NAME 0 (f)
4 8 LOAD_NAME 0 (f)
10 LOAD_CONST 2 (1)
12 LOAD_CONST 3 (2)
14 LOAD_CONST 4 (3)
16 LOAD_CONST 5 (4)
18 LOAD_CONST 6 (5)
20 LOAD_CONST 7 (('a', 'b'))
22 CALL_FUNCTION_KW 5
24 POP_TOP
26 LOAD_CONST 8 (None)
28 RETURN_VALUE
consts
code
argcount 0
nlocals 2
stacksize 3
flags 004f
2 0 LOAD_GLOBAL 0 (print)
2 LOAD_CONST 1 ('args:')
4 LOAD_FAST 0 (args)
6 FORMAT_VALUE 0
8 BUILD_STRING 2
10 CALL_FUNCTION 1
12 POP_TOP
3 14 LOAD_GLOBAL 0 (print)
16 LOAD_CONST 2 ('kwargs:')
18 LOAD_FAST 1 (kwargs)
20 FORMAT_VALUE 0
22 BUILD_STRING 2
24 CALL_FUNCTION 1
26 POP_TOP
28 LOAD_CONST 0 (None)
30 RETURN_VALUE
- The virtual machine eventually reaches PyEval_EvalCodeEx, a thin wrapper around _PyEval_EvalCodeWithName:
ceval.c
PyObject *
PyEval_EvalCodeEx(PyObject *_co, PyObject *globals, PyObject *locals,
PyObject *const *args, int argcount,
PyObject *const *kws, int kwcount,
PyObject *const *defs, int defcount,
PyObject *kwdefs, PyObject *closure)
{
return _PyEval_EvalCodeWithName(_co, globals, locals,
args, argcount,
kws, kws != NULL ? kws + 1 : NULL,
kwcount, 2,
defs, defcount,
kwdefs, closure,
NULL, NULL);
}
ceval.c
PyObject *
_PyEval_EvalCodeWithName(PyObject *_co, PyObject *globals, PyObject *locals,
PyObject *const *args, Py_ssize_t argcount,
PyObject *const *kwnames, PyObject *const *kwargs,
Py_ssize_t kwcount, int kwstep,
PyObject *const *defs, Py_ssize_t defcount,
PyObject *kwdefs, PyObject *closure,
PyObject *name, PyObject *qualname)
{
PyCodeObject* co = (PyCodeObject*)_co;
PyFrameObject *f;
PyObject *retval = NULL;
PyObject **fastlocals, **freevars;
PyThreadState *tstate;
PyObject *x, *u;
const Py_ssize_t total_args = co->co_argcount + co->co_kwonlyargcount;
Py_ssize_t i, n;
PyObject *kwdict;
if (globals == NULL) {
PyErr_SetString(PyExc_SystemError,
"PyEval_EvalCodeEx: NULL globals");
return NULL;
}
/* Create the frame */
tstate = PyThreadState_GET();
assert(tstate != NULL);
f = _PyFrame_New_NoTrack(tstate, co, globals, locals);
if (f == NULL) {
return NULL;
}
fastlocals = f->f_localsplus;
freevars = f->f_localsplus + co->co_nlocals;
/* Create a dictionary for keyword parameters (**kwags) */
if (co->co_flags & CO_VARKEYWORDS) {
kwdict = PyDict_New();
if (kwdict == NULL)
goto fail;
i = total_args;
if (co->co_flags & CO_VARARGS) {
i++;
}
SETLOCAL(i, kwdict);
}
else {
kwdict = NULL;
}
/* Copy positional arguments into local variables */
if (argcount > co->co_argcount) {
n = co->co_argcount;
}
else {
n = argcount;
}
for (i = 0; i < n; i++) {
x = args[i];
Py_INCREF(x);
SETLOCAL(i, x);
}
/* Pack other positional arguments into the *args argument */
if (co->co_flags & CO_VARARGS) {
u = PyTuple_New(argcount - n);
if (u == NULL) {
goto fail;
}
SETLOCAL(total_args, u);
for (i = n; i < argcount; i++) {
x = args[i];
Py_INCREF(x);
PyTuple_SET_ITEM(u, i-n, x);
}
}
/* Handle keyword arguments passed as two strided arrays */
kwcount *= kwstep;
for (i = 0; i < kwcount; i += kwstep) {
PyObject **co_varnames;
PyObject *keyword = kwnames[i];
PyObject *value = kwargs[i];
Py_ssize_t j;
if (keyword == NULL || !PyUnicode_Check(keyword)) {
PyErr_Format(PyExc_TypeError,
"%U() keywords must be strings",
co->co_name);
goto fail;
}
/* Speed hack: do raw pointer compares. As names are
normally interned this should almost always hit. */
co_varnames = ((PyTupleObject *)(co->co_varnames))->ob_item;
for (j = 0; j < total_args; j++) {
PyObject *name = co_varnames[j];
if (name == keyword) {
goto kw_found;
}
}
/* Slow fallback, just in case */
for (j = 0; j < total_args; j++) {
PyObject *name = co_varnames[j];
int cmp = PyObject_RichCompareBool( keyword, name, Py_EQ);
if (cmp > 0) {
goto kw_found;
}
else if (cmp < 0) {
goto fail;
}
}
assert(j >= total_args);
if (kwdict == NULL) {
PyErr_Format(PyExc_TypeError,
"%U() got an unexpected keyword argument '%S'",
co->co_name, keyword);
goto fail;
}
if (PyDict_SetItem(kwdict, keyword, value) == -1) {
goto fail;
}
continue;
kw_found:
if (GETLOCAL(j) != NULL) {
PyErr_Format(PyExc_TypeError,
"%U() got multiple values for argument '%S'",
co->co_name, keyword);
goto fail;
}
Py_INCREF(value);
SETLOCAL(j, value);
}
/* Check the number of positional arguments */
if (argcount > co->co_argcount && !(co->co_flags & CO_VARARGS)) {
too_many_positional(co, argcount, defcount, fastlocals);
goto fail;
}
/* Add missing positional arguments (copy default values from defs) */
if (argcount < co->co_argcount) {
Py_ssize_t m = co->co_argcount - defcount;
Py_ssize_t missing = 0;
for (i = argcount; i < m; i++) {
if (GETLOCAL(i) == NULL) {
missing++;
}
}
if (missing) {
missing_arguments(co, missing, defcount, fastlocals);
goto fail;
}
if (n > m)
i = n - m;
else
i = 0;
for (; i < defcount; i++) {
if (GETLOCAL(m+i) == NULL) {
PyObject *def = defs[i];
Py_INCREF(def);
SETLOCAL(m+i, def);
}
}
}
/* Add missing keyword arguments (copy default values from kwdefs) */
if (co->co_kwonlyargcount > 0) {
Py_ssize_t missing = 0;
for (i = co->co_argcount; i < total_args; i++) {
PyObject *name;
if (GETLOCAL(i) != NULL)
continue;
name = PyTuple_GET_ITEM(co->co_varnames, i);
if (kwdefs != NULL) {
PyObject *def = PyDict_GetItem(kwdefs, name);
if (def) {
Py_INCREF(def);
SETLOCAL(i, def);
continue;
}
}
missing++;
}
if (missing) {
missing_arguments(co, missing, -1, fastlocals);
goto fail;
}
}
/* Allocate and initialize storage for cell vars, and copy free
vars into frame. */
for (i = 0; i < PyTuple_GET_SIZE(co->co_cellvars); ++i) {
PyObject *c;
Py_ssize_t arg;
/* Possibly account for the cell variable being an argument. */
if (co->co_cell2arg != NULL &&
(arg = co->co_cell2arg[i]) != CO_CELL_NOT_AN_ARG) {
c = PyCell_New(GETLOCAL(arg));
/* Clear the local copy. */
SETLOCAL(arg, NULL);
}
else {
c = PyCell_New(NULL);
}
if (c == NULL)
goto fail;
SETLOCAL(co->co_nlocals + i, c);
}
/* Copy closure variables to free variables */
for (i = 0; i < PyTuple_GET_SIZE(co->co_freevars); ++i) {
PyObject *o = PyTuple_GET_ITEM(closure, i);
Py_INCREF(o);
freevars[PyTuple_GET_SIZE(co->co_cellvars) + i] = o;
}
/* Handle generator/coroutine/asynchronous generator */
if (co->co_flags & (CO_GENERATOR | CO_COROUTINE | CO_ASYNC_GENERATOR)) {
PyObject *gen;
PyObject *coro_wrapper = tstate->coroutine_wrapper;
int is_coro = co->co_flags & CO_COROUTINE;
if (is_coro && tstate->in_coroutine_wrapper) {
assert(coro_wrapper != NULL);
PyErr_Format(PyExc_RuntimeError,
"coroutine wrapper %.200R attempted "
"to recursively wrap %.200R",
coro_wrapper,
co);
goto fail;
}
/* Don't need to keep the reference to f_back, it will be set
* when the generator is resumed. */
Py_CLEAR(f->f_back);
/* Create a new generator that owns the ready to run frame
* and return that as the value. */
if (is_coro) {
gen = PyCoro_New(f, name, qualname);
} else if (co->co_flags & CO_ASYNC_GENERATOR) {
gen = PyAsyncGen_New(f, name, qualname);
} else {
gen = PyGen_NewWithQualName(f, name, qualname);
}
if (gen == NULL) {
return NULL;
}
_PyObject_GC_TRACK(f);
if (is_coro && coro_wrapper != NULL) {
PyObject *wrapped;
tstate->in_coroutine_wrapper = 1;
wrapped = PyObject_CallFunction(coro_wrapper, "N", gen);
tstate->in_coroutine_wrapper = 0;
return wrapped;
}
return gen;
}
retval = PyEval_EvalFrameEx(f,0);
fail: /* Jump here from prelude on failure */
/* decref'ing the frame can cause __del__ methods to get invoked,
which can call back into Python. While we're done with the
current Python frame (f), the associated C stack is still in use,
so recursion_depth must be boosted for the duration.
*/
assert(tstate != NULL);
if (Py_REFCNT(f) > 1) {
Py_DECREF(f);
_PyObject_GC_TRACK(f);
}
else {
++tstate->recursion_depth;
Py_DECREF(f);
--tstate->recursion_depth;
}
return retval;
}
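- When Python compiles a function and finds an excess positional parameter such as *list among its formal parameters, it sets the CO_VARARGS flag in co_flags of the PyCodeObject, indicating that excess positional arguments must be handled when the function is called.
- Likewise, for a function with a parameter such as **key, CO_VARKEYWORDS is set in co_flags, indicating that excess keyword arguments must be handled.
- The virtual machine creates a PyTupleObject, puts all excess positional arguments into it, and then uses SETLOCAL to store it in f_localsplus of the PyFrameObject at index total_args, that is, co->co_argcount + co->co_kwonlyargcount.
- Similarly, the virtual machine creates a PyDictObject and stores it in f_localsplus, at index total_args, or total_args + 1 when CO_VARARGS is also set.
- For each keyword argument passed by the caller, the virtual machine compares its name against the first total_args entries of co_varnames. On a match, it is an ordinary keyword argument and goes into the matching slot. Otherwise it is an excess keyword argument and is inserted into the PyDictObject, or a TypeError is raised if the function has no **key parameter.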
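The binding steps of _PyEval_EvalCodeWithName can be summarized in a Python sketch that builds the initial fastlocals list for a call. `bind_arguments` and `_NULL` are hypothetical names, the error messages are simplified, and positional-only parameters, cell variables and generators are ignored; it is an approximation of the logic above, not the actual implementation.

```python
import inspect

_NULL = object()  # stands in for a NULL slot in f_localsplus


def bind_arguments(func, args, kwargs):
    """Return the fastlocals list that func(*args, **kwargs) would start with."""
    code = func.__code__
    defaults = func.__defaults__ or ()
    kwdefaults = func.__kwdefaults__ or {}
    total_args = code.co_argcount + code.co_kwonlyargcount
    has_varargs = bool(code.co_flags & inspect.CO_VARARGS)
    has_varkw = bool(code.co_flags & inspect.CO_VARKEYWORDS)
    fastlocals = [_NULL] * code.co_nlocals

    # The **kwargs dict goes right after the named parameters (and after *args).
    kwdict = None
    if has_varkw:
        kwdict = {}
        fastlocals[total_args + (1 if has_varargs else 0)] = kwdict

    # Copy positional arguments; pack the surplus into the *args tuple.
    n = min(len(args), code.co_argcount)
    fastlocals[:n] = args[:n]
    if has_varargs:
        fastlocals[total_args] = tuple(args[n:])
    elif len(args) > code.co_argcount:
        raise TypeError("too many positional arguments")

    # Match each keyword against the named parameters, else fall back to kwdict.
    names = code.co_varnames[:total_args]
    for keyword, value in kwargs.items():
        if keyword in names:
            j = names.index(keyword)
            if fastlocals[j] is not _NULL:
                raise TypeError(f"multiple values for argument '{keyword}'")
            fastlocals[j] = value
        elif kwdict is None:
            raise TypeError(f"unexpected keyword argument '{keyword}'")
        else:
            kwdict[keyword] = value

    # Fill empty positional slots from defaults, keyword-only ones from kwdefaults.
    first_default = code.co_argcount - len(defaults)
    for i in range(code.co_argcount):
        if fastlocals[i] is _NULL:
            if i < first_default:
                raise TypeError(f"missing argument '{names[i]}'")
            fastlocals[i] = defaults[i - first_default]
    for i in range(code.co_argcount, total_args):
        if fastlocals[i] is _NULL:
            if names[i] not in kwdefaults:
                raise TypeError(f"missing keyword-only argument '{names[i]}'")
            fastlocals[i] = kwdefaults[names[i]]
    return fastlocals


def g(*args, **kwargs): ...
def h(a, b=2, *args, c=3, **kwargs): ...

print(bind_arguments(g, (1, 2, 3), {"a": 4, "b": 5}))
print(bind_arguments(h, (1, 5, 6, 7), {"d": 4}))
```

For g, which mirrors demo.py, slot 0 receives the tuple and slot 1 the dict, matching the LOAD_FAST 0 (args) and LOAD_FAST 1 (kwargs) instructions in the listing.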