Pool
To start a large number of child processes, create them in batch with a process pool:
from multiprocessing import Pool
import os, time, random

def long_time_task(name):
    # Runs in a worker process: report the PID, sleep up to 3 seconds, report the run time.
    print('Run task {0}{1}...'.format(name, os.getpid()))
    start = time.time()
    time.sleep(random.random() * 3)
    end = time.time()
    print('Task {0} runs {1} seconds'.format(name, (end - start)))

if __name__ == '__main__':
    print('Parent process {0}'.format(os.getpid()))
    p = Pool()
    for i in range(5):
        # Submit the task without waiting for it to finish.
        p.apply_async(long_time_task, args=(i,))
    print('Waiting for all subprocesses done')
    p.close()  # no more tasks will be submitted
    p.join()   # wait for every worker to finish
    print('aLL subprocesses done')
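Output (in each "Run task" line the task number is immediately followed by the worker's PID, so 06220 means task 0 in process 6220):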
Parent process 752
Waiting for all subprocesses done
Run task 06220...
Run task 117188...
Run task 27792...
Run task 33184...
Task 1 runs 0.5721392631530762 seconds
Run task 417188...
Task 3 runs 1.0217230319976807 seconds
Task 0 runs 1.7482101917266846 seconds
Task 2 runs 2.8151016235351562 seconds
Task 4 runs 2.6533689498901367 seconds
aLL subprocesses done
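Calling join() on a Pool object waits for all child processes to finish. You must call close() before calling join(), and once close() has been called no new Process can be added to the pool.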
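pool.apply_async
Non-blocking: tasks run at the same time, up to the maximum number of processes defined for the pool.
pool.apply
Blocking: a task must finish and its process be released back to the pool before the next task starts.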
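The difference between the two calls can be sketched as follows. This is a minimal illustration, not part of the original example: square, run_blocking and run_non_blocking are hypothetical names, and the 0.1 second sleep only stands in for real work.

```python
from multiprocessing import Pool
import time

def square(x):
    """Hypothetical stand-in task: sleep briefly, then return x squared."""
    time.sleep(0.1)
    return x * x

def run_blocking(pool, values):
    # apply() returns only when the task has finished, so tasks run one by one.
    return [pool.apply(square, args=(v,)) for v in values]

def run_non_blocking(pool, values):
    # apply_async() returns an AsyncResult immediately; get() waits for the value.
    pending = [pool.apply_async(square, args=(v,)) for v in values]
    return [r.get() for r in pending]

if __name__ == '__main__':
    values = [1, 2, 3, 4]
    with Pool(4) as pool:
        t0 = time.time()
        print(run_blocking(pool, values), 'apply took', time.time() - t0)
        t0 = time.time()
        print(run_non_blocking(pool, values), 'apply_async took', time.time() - t0)
```

Both helpers return the same list; only the elapsed time should differ, because apply() serialises the tasks while apply_async() lets the pool run them side by side.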
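Look at the output of the long_time_task run above: tasks 0, 1, 2 and 3 start immediately, while task 4 has to wait until one of the earlier tasks finishes. This is because the default size of Pool, which is taken from the number of CPUs, is 4 on my computer, so at most 4 processes run at the same time. This limit is a deliberate design choice of Pool, not a limit imposed by the operating system.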
If you change it to:
p = Pool(5)
then 5 processes can run at the same time.
Output:
Parent process 1916
Waiting for all subprocesses done
Run task 015420...
Run task 14056...
Run task 211628...
Run task 36832...
Run task 47920...
Task 2 runs 0.14444756507873535 seconds
Task 4 runs 0.3404884338378906 seconds
Task 3 runs 0.8614604473114014 seconds
Task 1 runs 1.793064832687378 seconds
Task 0 runs 2.1561684608459473 seconds
aLL subprocesses done
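The effect of the pool size can be checked with a small sketch like the one below. It assumes that the default size comes from the CPU count reported by the OS (os.cpu_count(), or os.process_cpu_count() on recent Python versions); worker_pid and distinct_workers are hypothetical helper names introduced only for this illustration.

```python
from multiprocessing import Pool
import os
import time

def worker_pid(_):
    """Hypothetical helper: pause briefly, then return the PID of the worker that ran it."""
    time.sleep(0.1)
    return os.getpid()

def distinct_workers(pool_size, task_count):
    # Run task_count tasks on a pool of pool_size workers and count
    # how many different worker processes actually handled them.
    with Pool(pool_size) as pool:
        pids = pool.map(worker_pid, range(task_count))
    return len(set(pids))

if __name__ == '__main__':
    print('CPU count reported by the OS:', os.cpu_count())
    # Never more than 5 distinct workers, however many tasks are queued.
    print('Workers used by Pool(5):', distinct_workers(5, 20))
```

However many tasks are queued, the number of distinct PIDs cannot exceed the size passed to Pool.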
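Multithreading
Multitasking can be done with multiple processes, or with multiple threads inside a single process.
As mentioned earlier, a process is made up of several threads, and every process has at least one thread.
Import the threading module: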
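import threading

def worker(args):
    print("Start thread {0}".format(args))
    print("End thread {0}".format(args))

if __name__ == '__main__':
    print("start main")
    # Create the threads. args must be a tuple, so a single argument is written as (1,).
    t1 = threading.Thread(target=worker, args=(1,))
    t2 = threading.Thread(target=worker, args=(2,))
    t1.start()
    t2.start()
    print("end main")
Output (the order of the lines can vary from run to run, because the main thread does not wait for the other threads):
start main
Start thread 1
End thread 1
end main
Start thread 2
End thread 2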
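In the output above, "end main" appears before thread 2 has finished, because the main thread never waits for the threads it started. A minimal sketch of waiting for them with join() follows; run_workers and the shared log list are hypothetical names used only for this illustration.

```python
import threading

def worker(n, log):
    # Each thread records when it starts and when it ends.
    log.append('start {0}'.format(n))
    log.append('end {0}'.format(n))

def run_workers(count):
    log = []
    threads = [threading.Thread(target=worker, args=(i, log))
               for i in range(1, count + 1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # block until this thread has finished
    # Reached only after every worker thread is done.
    log.append('end main')
    return log

if __name__ == '__main__':
    for line in run_workers(2):
        print(line)
```

With the join() calls in place, 'end main' is always the last entry, whatever order the worker threads run in.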
A second example, with a named thread and a call to join():
import time, threading

def loop():
    # Runs in the new thread: counts to 5, pausing one second after each step.
    print('thread {0} is running...'.format(threading.current_thread().name))
    n = 0
    while n < 5:
        n = n + 1
        print('thread {0} >>> {1}'.format(threading.current_thread().name, n))
        time.sleep(1)
    print('thread {0} ended....'.format(threading.current_thread().name))

# This part runs in the main thread.
print('thread {0} is running....'.format(threading.current_thread().name))
t = threading.Thread(target=loop, name='LoopThread')
t.start()
t.join()  # wait for LoopThread to finish
print('thread {0} ended'.format(threading.current_thread().name))
Output:
thread MainThread is running....
thread LoopThread is running...
thread LoopThread >>> 1
thread LoopThread >>> 2
thread LoopThread >>> 3
thread LoopThread >>> 4
thread LoopThread >>> 5
thread LoopThread ended....
thread MainThread ended
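Every process starts one thread by default, which we call the main thread, and the main thread can start new threads. Python's threading module has a current_thread() function that always returns the instance of the current thread. The main thread's instance is named MainThread. A child thread's name is given when it is created, and here we name the child thread LoopThread. The name is only used for display when printing and has no other meaning. If you don't give a name, Python names the threads automatically as Thread-1, Thread-2, and so on (since Python 3.10 the target function's name is appended, for example Thread-1 (loop)).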
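The naming rules described above can be observed with a short sketch. report and collect_names are hypothetical helpers introduced for this illustration; the exact automatic name depends on the Python version, so the code only prints it.

```python
import threading

def report(names):
    # Record the name of whichever thread is running this function.
    names.append(threading.current_thread().name)

def collect_names():
    names = []
    named = threading.Thread(target=report, args=(names,), name='LoopThread')
    unnamed = threading.Thread(target=report, args=(names,))  # name chosen by Python
    for t in (named, unnamed):
        t.start()
        t.join()  # run the threads one after another so the order is fixed
    return names

if __name__ == '__main__':
    print('current thread:', threading.current_thread().name)
    print('child threads:', collect_names())
```

The first entry is the name passed in explicitly; the second is whatever Python generated, which begins with Thread-.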