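1. Estimate the complexity
Manually estimate the number of lines of code and the computational complexity.
Rough cost: number of iterations * lines of code inside the loop * complexity of a single line, e.g. O(n)
Complexity is usually judged by its order of growth, such as O(n) or O(n^2).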
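As a rough illustration of the estimate above, the sketch below counts how many times a loop body runs for a single loop and for a nested loop. The function names count_linear and count_quadratic are made up for this example; they only count steps and do no real work.

```python
def count_linear(n):
    """One pass over n items: the loop body runs n times, so O(n)."""
    steps = 0
    for _ in range(n):
        steps += 1
    return steps


def count_quadratic(n):
    """Two nested passes over n items: the inner body runs n * n times, so O(n^2)."""
    steps = 0
    for _ in range(n):
        for _ in range(n):
            steps += 1
    return steps


# Growing n by 10x grows the linear count by 10x and the quadratic count by 100x.
for n in (10, 100, 1000):
    print(n, count_linear(n), count_quadratic(n))
```

The order of growth is what matters here: the constant factors (lines per iteration, cost per line) change the absolute time but not how fast it blows up as n grows.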
2. Use IPython's %timeit to measure the average run time
# run in a Jupyter notebook (ipynb)
import time
strings = ['foo','foobar','bax','ssss','python'] * 10000
%timeit [x for x in strings if x[:3] == 'foo']
%timeit [x for x in strings if x.startswith('foo')]
#output
5.79 ms ± 316 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
9.7 ms ± 520 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
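Outside IPython, roughly the same comparison can be made with the standard-library timeit module. This is a minimal sketch: the wrapper functions by_slice and by_startswith and the NUMBER and REPEAT values are choices made for this example (smaller than the 7 runs of 100 loops above, to keep it quick), and the timings will differ from machine to machine.

```python
import statistics
import timeit

strings = ['foo', 'foobar', 'bax', 'ssss', 'python'] * 10000


def by_slice():
    # Compare the first three characters with a slice.
    return [x for x in strings if x[:3] == 'foo']


def by_startswith():
    # Same filter, using the str.startswith method.
    return [x for x in strings if x.startswith('foo')]


NUMBER = 10   # loops per run (assumed value for this sketch)
REPEAT = 5    # number of runs (assumed value for this sketch)

slice_times = timeit.repeat(by_slice, repeat=REPEAT, number=NUMBER)
startswith_times = timeit.repeat(by_startswith, repeat=REPEAT, number=NUMBER)

for label, times in (('slice', slice_times), ('startswith', startswith_times)):
    # Each entry in times is the total for NUMBER loops, so divide to get per-loop time.
    per_loop_ms = statistics.mean(times) / NUMBER * 1000
    print('%s: %.3f ms per loop' % (label, per_loop_ms))
```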
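3. Profiling performance with the Python 3 profilers
- Profiling a single function
%prun add_and_sum(x,y)
%prun (cProfile) gives a macro-level, per-function profile; %lprun (from the line_profiler extension) gives a micro-level, line-by-line profile.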
#uses ipython
In [1]: from numpy.random import randn
   ...:
   ...: def add_and_sum(x, y):
   ...:     added = x + y
   ...:     summed = added.sum(axis=1)
   ...:     return summed
   ...:
   ...: def call_fun():
   ...:     x = randn(1000, 1000)
   ...:     y = randn(1000, 1000)
   ...:     return add_and_sum(x, y)
   ...:
In [14]: %prun add_and_sum(x,y)
7 function calls in 0.075 seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.052 0.052 0.067 0.067 <ipython-input-1-c92741b0d5ac>:3(add_and_sum)
1 0.015 0.015 0.015 0.015 {method 'reduce' of 'numpy.ufunc' objects}
1 0.008 0.008 0.075 0.075 <string>:1(<module>)
1 0.000 0.000 0.075 0.075 {built-in method builtins.exec}
1 0.000 0.000 0.015 0.015 {method 'sum' of 'numpy.ndarray' objects}
1 0.000 0.000 0.015 0.015 _methods.py:31(_sum)
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
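A similar report can be produced in a plain Python script with cProfile.Profile and pstats. The sketch below is only an approximation of the session above: to stay free of third-party dependencies it swaps the NumPy arrays for nested lists, so add_and_sum here is a pure-Python stand-in and make_matrix is a hypothetical helper, and the numbers will not match the ones shown.

```python
import cProfile
import io
import pstats
import random


def add_and_sum(x, y):
    # Pure-Python stand-in for the NumPy version: add element-wise, then sum each row.
    added = [[a + b for a, b in zip(row_x, row_y)] for row_x, row_y in zip(x, y)]
    return [sum(row) for row in added]


def make_matrix(n):
    # Hypothetical helper: an n x n list of random floats.
    return [[random.random() for _ in range(n)] for _ in range(n)]


x = make_matrix(200)
y = make_matrix(200)

# Profile only the call of interest, not the data setup.
profiler = cProfile.Profile()
profiler.enable()
row_sums = add_and_sum(x, y)
profiler.disable()

# Sort by internal time ('tottime'), the same ordering shown in the %prun report.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats('tottime').print_stats()
report = buffer.getvalue()
print(report)
```

In the report, tottime is the time spent inside a function itself, while cumtime also includes the functions it calls, which is why add_and_sum in the session above shows 0.052 versus 0.067.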
- Commonly used when you do not want to read a module's source code and just want a quick performance test or comparison
# IN[1]
import cProfile
import re
cProfile.run('re.compile("foo|bar")')
#OUT[1]
5 function calls in 0.000 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.000 0.000 <string>:1(<module>)
1 0.000 0.000 0.000 0.000 re.py:231(compile)
1 0.000 0.000 0.000 0.000 re.py:286(_compile)
1 0.000 0.000 0.000 0.000 {built-in method builtins.exec}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
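When the result should be kept for later comparison rather than printed once, the profile can be written to a file and reloaded with pstats. A minimal sketch, assuming a temporary file is acceptable as the storage location; it uses cProfile.runctx so the namespace containing re is passed explicitly instead of relying on the main module.

```python
import cProfile
import io
import os
import pstats
import re
import tempfile

# Write the raw profile data to a temporary file instead of printing it directly.
fd, profile_path = tempfile.mkstemp(suffix='.prof')
os.close(fd)

# runctx receives the namespace explicitly, so the statement can see `re`.
cProfile.runctx('re.compile("foo|bar")', {'re': re}, {}, filename=profile_path)

# Reload the saved data, shorten the file paths, and sort by cumulative time.
buffer = io.StringIO()
stats = pstats.Stats(profile_path, stream=buffer)
stats.strip_dirs().sort_stats('cumulative').print_stats(10)
report = buffer.getvalue()
print(report)

# Clean up the temporary file once the report has been built.
os.remove(profile_path)
```

Saving one file per implementation makes it possible to compare them side by side later without re-running the code.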
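# test code (Python 3.6)
import numpy as np
from numpy.linalg import eigvals

def run_experiment(niter=100):
    K = 100
    results = []
    # range replaces xrange, which does not exist in Python 3
    for _ in range(niter):
        mat = np.random.rand(K, K)
        max_eigenvalue = np.abs(eigvals(mat)).max()
        # store each result; without this the list stays empty and np.max fails
        results.append(max_eigenvalue)
    return results

some_results = run_experiment()
print('largest one %s' % np.max(some_results))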
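4. Recommended code structure
Flat is better than nested. When writing functions and classes, aim for low coupling and a clear structure, so that they are easy to test and to call.
References
- Python for Data Analysis (《利用Python进行数据分析》)
- Official Python documentation: 27.4. The Python Profilers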
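To make the "flat over nested" advice concrete, here is a small sketch of the same logic written both ways. The functions classify_nested and classify_flat are invented for this illustration only.

```python
def classify_nested(value):
    # Nested version: every extra condition adds another indentation level.
    if value is not None:
        if isinstance(value, (int, float)):
            if value >= 0:
                return 'non-negative'
            else:
                return 'negative'
        else:
            return 'not a number'
    else:
        return 'missing'


def classify_flat(value):
    # Flat version: guard clauses return early, so the main path stays at one level.
    if value is None:
        return 'missing'
    if not isinstance(value, (int, float)):
        return 'not a number'
    if value < 0:
        return 'negative'
    return 'non-negative'


for sample in (None, 'a', -1, 0, 2.5):
    print(repr(sample), classify_flat(sample))
```

Small, flat functions like this are also easier to profile, because each one shows up as its own row in a cProfile report.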
- https://pymotw.com/3/profile/
Open problem: I could not solve this on 2018.5.18 and still had no idea by 2018.7.26. I will come back to it when I figure it out.
In [19]: %run -p -s cumulative n2.py
ERROR:root:Line magic function `%lprun` not found.
A likely cause, assuming the line_profiler package is installed: %lprun is not built into IPython and only becomes available after running %load_ext line_profiler.
Writing history
2018.5.18 Draft
2018.7.26 V1.0