主要有一下几个监控程序。
cProfile:可以打印每个函数的运行次数及运行总时间和平均时间。
line_profiler: 打印每一行代码的执行次数,执行总时间和执行平均时间。
memory_profiler: 打印每一行代占用内存的总量及内存的增量。(执行速度较慢)
tracemalloc: 打印每一行代码的内存总量,对象数量,平均大小。还可以对比前后运行的差异,及回溯内存的代码。
1.cProfile
Python自带了几个性能分析的模块:profile、cProfile和hotshot,使用方法基本都差不多,无非模块是纯Python还是用C写的。本文介绍cProfile。
import time
def func1():
sum = 0
for i in range(1000000):
sum += i
def func2():
time.sleep(10)
func1()
func2()
运行
python -m cProfile del.py
结果
6 function calls in 10.047 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 10.047 10.047 test.py:1(<module>)
1 0.000 0.000 10.000 10.000 test.py:10(func2)
1 0.047 0.047 0.047 0.047 test.py:4(func1)
1 0.000 0.000 10.047 10.047 {built-in method builtins.exec}
1 10.000 10.000 10.000 10.000 {built-in method time.sleep}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
其中对参数的解释:
ncalls:表示函数调用的次数;
tottime:表示指定函数的总的运行时间,除掉函数中调用子函数的运行时间;
percall:(第一个percall)等于 tottime/ncalls;
cumtime:表示该函数及其所有子函数的调用运行的时间,即函数开始调用到返回的时间;
percall:(第二个percall)即函数运行一次的平均时间,等于 cumtime/ncalls;
filename:lineno(function):每个函数调用的具体信息;
需要注意的是cProfile很难搞清楚函数内的每一行发生了什么,是针对整个函数来说的。
-s cumulative
-s cumulative开关告诉cProfile对每个函数累计花费的时间进行排序,他能让我看到代码最慢的部分。
2.line_profiler
主要作用就是记录每一行代码的执行时间及执行次数。
安装:
windows无法直接安装,可以再https://www.lfd.uci.edu/~gohlke/pythonlibs/找到对应的版本然后再安装。
line_profiler有两种调用方式:
一种是直接在代码里调用
from line_profiler import LineProfiler
import random
def do_stuff(numbers):
s = sum(numbers)
l = [numbers[i] / 43 for i in range(len(numbers))]
m = ['hello' + str(numbers[i]) for i in range(len(numbers))]
numbers = [random.randint(1, 100) for i in range(1000)]
lp = LineProfiler()
lp_wrapper = lp(do_stuff)
lp_wrapper(numbers)
lp.print_stats()
输出结果:
Timer unit: 1e-06 s
Total time: 0.000649 s
File: <ipython-input-2-2e060b054fea>
Function: do_stuff at line 4
Line # Hits Time Per Hit % Time Line Contents
==============================================================
4 def do_stuff(numbers):
5 1 10 10.0 1.5 s = sum(numbers)
6 1 186 186.0 28.7 l = [numbers[i]/43 for i in range(len(numbers))]
7 1 453 453.0 69.8 m = ['hello'+str(numbers[i]) for i in range(len(numbers))]
每一列的含义如下:
Line :文件中的行号。
Hits:该行被执行的次数。
Time:以计时器单位执行该行所花费的总时间。 在表格前的标题信息中,您将看到“计时器单位:”行,将转换因子指定为秒。 在不同的系统上可能有所不同。
Per Hit:以计时器为单位执行该行一次所花费的平均时间。
% Time:相对于该功能所记录的总时间,该行所花费的时间的百分比。
Line Contents:实际的源代码。 请注意,在查看格式化结果时,总是从磁盘读取此内容,而不是在执行代码时读取。 如果同时编辑了文件,则各行将不匹配,并且格式化程序甚至可能无法找到要显示的功能。
对于函数内部调用函数的情况:
from line_profiler import LineProfiler
import random
def do_other_stuff(numbers):
s = sum(numbers)
def do_stuff(numbers):
do_other_stuff(numbers)
l = [numbers[i]/43 for i in range(len(numbers))]
m = ['hello'+str(numbers[i]) for i in range(len(numbers))]
numbers = [random.randint(1,100) for i in range(1000)]
lp = LineProfiler()
lp_wrapper = lp(do_stuff)
lp_wrapper(numbers)
lp.print_stats()
这样做的话,只能显示子函数的总时间。输出如下:
Timer unit: 1e-06 s
Total time: 0.000773 s
File: <ipython-input-3-ec0394d0a501>
Function: do_stuff at line 7
Line # Hits Time Per Hit % Time Line Contents
==============================================================
7 def do_stuff(numbers):
8 1 11 11.0 1.4 do_other_stuff(numbers)
9 1 236 236.0 30.5 l = [numbers[i]/43 for i in range(len(numbers))]
10 1 526 526.0 68.0 m = ['hello'+str(numbers[i]) for i in range(len(numbers))]
为了能够同时显示函数每行所用时间和调用函数每行所用时间,加入add_function就能够解决。
from line_profiler import LineProfiler
import random
def do_other_stuff(numbers):
s = sum(numbers)
def do_stuff(numbers):
do_other_stuff(numbers)
l = [numbers[i]/43 for i in range(len(numbers))]
m = ['hello'+str(numbers[i]) for i in range(len(numbers))]
numbers = [random.randint(1,100) for i in range(1000)]
lp = LineProfiler()
lp.add_function(do_other_stuff) # add additional function to profile
lp_wrapper = lp(do_stuff)
lp_wrapper(numbers)
lp.print_stats()
Timer unit: 1e-06 s
Total time: 9e-06 s
File: <ipython-input-4-dae73707787c>
Function: do_other_stuff at line 4
Line # Hits Time Per Hit % Time Line Contents
==============================================================
4 def do_other_stuff(numbers):
5 1 9 9.0 100.0 s = sum(numbers)
Total time: 0.000694 s
File: <ipython-input-4-dae73707787c>
Function: do_stuff at line 7
Line # Hits Time Per Hit % Time Line Contents
==============================================================
7 def do_stuff(numbers):
8 1 12 12.0 1.7 do_other_stuff(numbers)
9 1 208 208.0 30.0 l = [numbers[i]/43 for i in range(len(numbers))]
10 1 474 474.0 68.3 m = ['hello'+str(numbers[i]) for i in range(len(numbers))]
也可以添加装饰器启动
import random
@profile
def do_other_stuff(numbers):
s = sum(numbers)
@profile
def do_stuff(numbers):
do_other_stuff(numbers)
l = [numbers[i] / 43 for i in range(len(numbers))]
m = ['hello' + str(numbers[i]) for i in range(len(numbers))]
numbers = [random.randint(1, 100) for i in range(1000)]
do_stuff(numbers)
不要担心,你不需要导入任何模组,直接使用命令启动即可
kernprof -l script_to_profile.py
或者
python -m line_profiler script_to_profile.py.lprof
运行结果如下
Timer unit: 2.84445e-07 s
Total time: 7.68e-06 s
File: test.py
Function: do_other_stuff at line 3
Line # Hits Time Per Hit % Time Line Contents
==============================================================
3 @profile
4 def do_other_stuff(numbers):
5 1 27.0 27.0 100.0 s = sum(numbers)
Total time: 0.000570311 s
File: test.py
Function: do_stuff at line 7
Line # Hits Time Per Hit % Time Line Contents
==============================================================
7 @profile
8 def do_stuff(numbers):
9 1 59.0 59.0 2.9 do_other_stuff(numbers)
10 1 639.0 639.0 31.9 l = [numbers[i] / 43 for i in range(len(numbers))]
11 1 1307.0 1307.0 65.2 m = ['hello' + str(numbers[i]) for i in range(len(numbers))]
3.memory_profiler
memory_profiler模块用来基于逐行测量代码的内存使用。使用这个模块会让代码运行的更慢。
安装方法如下:
pip install memory_profiler
另外,建议安装psutil包,这样memory_profile会运行的快一点:
pip install psutil
例子:
import time
@profile
def my_func():
a = [1] * (10 ** 6)
b = [2] * (2 * 10 ** 7)
time.sleep(10)
del b
del a
print("+++++++++")
if __name__ == '__main__':
my_func()
启动程序
python -m memory_profiler primes.py
输出结果:
+++++++++
Filename: test.py
Line # Mem usage Increment Line Contents
================================================
4 40.684 MiB 40.684 MiB @profile
5 def my_func():
6 48.316 MiB 7.633 MiB a = [1] * (10 ** 6)
7 200.906 MiB 152.590 MiB b = [2] * (2 * 10 ** 7)
8 200.914 MiB 0.008 MiB time.sleep(10)
9 48.324 MiB 0.000 MiB del b
10 40.691 MiB 0.000 MiB del a
11 40.699 MiB 0.008 MiB print("+++++++++")
Line:行号
Mem usage: 内存使用
Increment : 内存增量
Line Contents: 实际代码内容
第三方api
memory_profiler公开了第三方代码中要使用的许多功能。
memory_usage(proc = -1,interval = .1,timeout = None)
返回一个时间间隔内的内存使用情况。 第一个参数proc表示应监视的内容。
这可以是进程(不一定是Python程序)的PID,包含要评估的某些python代码的字符串,
也可以是包含要评估为f(* args)的函数及其参数的元组(f,args,kw) ,** kw)。 例如,
>>> from memory_profiler import memory_usage
>>> mem_usage = memory_usage(-1, interval=.2, timeout=1)
>>> print(mem_usage)
[7.296875, 7.296875, 7.296875, 7.296875, 7.296875]
在这里,我告诉memory_profiler获取当前进程在1秒内的内存消耗,时间间隔为0.2秒。 正如我给PID的-1那样,它是一个特殊的数字(PID通常为正),表示当前进程,也就是说,我正在获取当前Python解释器的内存使用情况。 因此,我从一个普通的python解释器中获得了大约7MB的内存使用量。 如果在IPython(控制台)上尝试相同的操作,我将获得29MB,如果在IPython笔记本上尝试相同的操作,则可扩展至44MB。
如果要获取Python函数的内存消耗,则应在元组(f,args,kw)中指定该函数及其参数。 例如:
>>> # define a simple function
>>> def f(a, n=100):
... import time
... time.sleep(2)
... b = [a] * n
... time.sleep(1)
... return b
...
>>> from memory_profiler import memory_usage
>>> memory_usage((f, (1,), {'n' : int(1e6)}))
这将执行代码f(1,n = int(1e6)),并在执行期间返回内存消耗。
4.tracemalloc
python3的标准库
import tracemalloc # from 3.4
tracemalloc.start() # 开始跟踪内存分配
d = [dict(zip('xy',(5, 6))) for i in range(1000000)]
t = [tuple(zip('xy',(5, 6))) for i in range(1000000)]
snapshot = tracemalloc.take_snapshot() # 快照,当前内存分配
top_stats = snapshot.statistics('lineno') # 快照对象的统计
for stat in top_stats:
print(stat)
结果
F:/lyh/git/test.py:5: size=237 MiB, count=2000002, average=124 B
F:/lyh/git/test.py:6: size=191 MiB, count=3000002, average=67 B
F:/lyh/git/test.py:8: size=448 B, count=1, average=448 B
C:\Users\yuanbao\AppData\Local\Programs\Python\Python36\lib\tracemalloc.py:522: size=64 B, count=1, average=64 B
每个文件名和每个 line 编号分配的 memory 块的统计信息:分配的 memory 块的总大小,数量和平均大小
计算两个快照之间的差异以检测 memory 泄漏
import tracemalloc # from 3.4
tracemalloc.start() # 开始跟踪内存分配
d = [dict(zip('xy',(5, 6))) for i in range(1000000)]
snapshot1 = tracemalloc.take_snapshot()
t = [tuple(zip('xy',(5, 6))) for i in range(1000000)]
# ... call the function leaking memory ...
snapshot2 = tracemalloc.take_snapshot()
top_stats = snapshot2.compare_to(snapshot1, 'lineno')
print("[ Top 10 differences ]")
for stat in top_stats[:10]:
print(stat)
结果
[ Top 10 differences ]
F:/lyh/git/test.py:7: size=191 MiB (+191 MiB), count=3000004 (+3000004), average=67 B
C:\Users\yuanbao\AppData\Local\Programs\Python\Python36\lib\tracemalloc.py:387: size=664 B (+664 B), count=5 (+5), average=133 B
C:\Users\yuanbao\AppData\Local\Programs\Python\Python36\lib\tracemalloc.py:524: size=568 B (+568 B), count=3 (+3), average=189 B
C:\Users\yuanbao\AppData\Local\Programs\Python\Python36\lib\tracemalloc.py:281: size=112 B (+112 B), count=2 (+2), average=56 B
C:\Users\yuanbao\AppData\Local\Programs\Python\Python36\lib\tracemalloc.py:522: size=128 B (+64 B), count=2 (+1), average=64 B
F:/lyh/git/test.py:5: size=237 MiB (+0 B), count=2000002 (+0), average=124 B
F:/lyh/git/test.py:6: size=448 B (+0 B), count=1 (+0), average=448 B
显示最大内存块的回溯的代码:
import tracemalloc # from 3.4
tracemalloc.start(25) # 开始跟踪内存分配
d = [dict(zip('xy',(5, 6))) for i in range(1000000)]
# snapshot1 = tracemalloc.take_snapshot()
t = [tuple(zip('xy',(5, 6))) for i in range(1000000)]
snapshot = tracemalloc.take_snapshot()
top_stats = snapshot.statistics('traceback')
# pick the biggest memory block
stat = top_stats[0]
print("%s memory blocks: %.1f KiB" % (stat.count, stat.size / 1024))
for line in stat.traceback.format():
print(line)
2000001 memory blocks: 242868.5 KiB
File "F:/lyh/git/test.py", line 5
d = [dict(zip('xy',(5, 6))) for i in range(1000000)]
File "F:/lyh/git/test.py", line 5
d = [dict(zip('xy',(5, 6))) for i in range(1000000)]
参考文章:https://blog.csdn.net/guofangxiandaihua/article/details/77825524
https://www.docs4dev.com/docs/zh/python/3.7.2rc1/all/library-tracemalloc.html