I want to run 15 commands, but only 3 at a time.

Test file:
import multiprocessing
import time
import random
import subprocess

def popen_wrapper(i):
    p = subprocess.Popen(['echo', 'hi'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stdout, stderr = p.communicate()
    print(stdout)
    time.sleep(random.randint(5, 20))  # pretend it's doing some work
    return p.returncode
num_to_run = 15
max_parallel = 3

running = []
for i in range(num_to_run):
    p = multiprocessing.Process(target=popen_wrapper, args=(i,))
    running.append(p)
    p.start()
    if len(running) >= max_parallel:
        # blocking wait - join on whoever finishes first, then continue
    else:
        # nonblocking wait - see if any process is finished; if so, join the finished processes
I'm not sure how to implement these comments:

    if len(running) >= max_parallel:
        # blocking wait - join on whoever finishes first, then continue
    else:
        # nonblocking wait - see if any process is finished; if so, join the finished processes
I can't just do:

    for p in running:
        p.join()

because the second process in running may already be finished while I'm still blocked on the first.

Question: how can I check, both blocking and non-blocking, whether any of the running processes has finished (i.e. find the first finished process)?
Looking for something like a wait-any call, maybe.
Answer:
Perhaps the easiest way is to use multiprocessing.Pool:

    pool = mp.Pool(3)

will set up a pool of 3 worker processes. You can then send 15 tasks to the pool:
    for i in range(num_to_run):
        pool.apply_async(popen_wrapper, args=(i,), callback=log_result)
All the machinery needed to coordinate the 3 workers and the 15 tasks is taken care of by mp.Pool.

Using mp.Pool:
import multiprocessing as mp
import time
import random
import subprocess
import logging

logger = mp.log_to_stderr(logging.WARN)

def popen_wrapper(i):
    logger.warn('echo "hi"')
    return i

def log_result(retval):
    results.append(retval)

if __name__ == '__main__':
    num_to_run = 15
    max_parallel = 3
    results = []

    pool = mp.Pool(max_parallel)
    for i in range(num_to_run):
        pool.apply_async(popen_wrapper, args=(i,), callback=log_result)
    pool.close()
    pool.join()
    logger.warn(results)
yields:
[WARNING/PoolWorker-1] echo "hi"
[WARNING/PoolWorker-3] echo "hi"
[WARNING/PoolWorker-1] echo "hi"
[WARNING/PoolWorker-1] echo "hi"
[WARNING/PoolWorker-3] echo "hi"
[WARNING/PoolWorker-1] echo "hi"
[WARNING/PoolWorker-3] echo "hi"
[WARNING/PoolWorker-1] echo "hi"
[WARNING/PoolWorker-3] echo "hi"
[WARNING/PoolWorker-1] echo "hi"
[WARNING/PoolWorker-3] echo "hi"
[WARNING/PoolWorker-1] echo "hi"
[WARNING/PoolWorker-1] echo "hi"
[WARNING/PoolWorker-3] echo "hi"
[WARNING/PoolWorker-2] echo "hi"
[WARNING/MainProcess] [0, 2, 3, 5, 4, 6, 7, 8, 9, 10, 11, 12, 14, 13, 1]
The logging statements show which PoolWorker handled each task, and the last logging statement shows that the MainProcess has received the return values from the 15 calls to popen_wrapper.
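A side note: if you care about processing results in the order the tasks *finish* rather than the order they were submitted, `Pool.imap_unordered` yields results as workers complete them. A minimal sketch (the `square` helper here is a hypothetical stand-in for popen_wrapper):

```python
import multiprocessing as mp

def square(i):
    # stand-in for the real work; popen_wrapper would go here
    return i * i

if __name__ == '__main__':
    with mp.Pool(3) as pool:
        # results arrive in whatever order the 3 workers finish them
        for result in pool.imap_unordered(square, range(15)):
            print(result)
```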
If you want to do this without a pool, you can set up an mp.Queue for the tasks and another mp.Queue for the return values:

Using mp.Process and mp.Queues:
import multiprocessing as mp
import time
import random
import subprocess
import logging

logger = mp.log_to_stderr(logging.WARN)
SENTINEL = None

def popen_wrapper(inqueue, outqueue):
    for i in iter(inqueue.get, SENTINEL):
        logger.warn('echo "hi"')
        outqueue.put(i)

if __name__ == '__main__':
    num_to_run = 15
    max_parallel = 3

    inqueue = mp.Queue()
    outqueue = mp.Queue()
    procs = [mp.Process(target=popen_wrapper, args=(inqueue, outqueue))
             for i in range(max_parallel)]
    for p in procs:
        p.start()
    for i in range(num_to_run):
        inqueue.put(i)
    for i in range(max_parallel):
        # Put sentinels in the queue to tell `popen_wrapper` to quit
        inqueue.put(SENTINEL)
    for p in procs:
        p.join()
    results = [outqueue.get() for i in range(num_to_run)]
    logger.warn(results)
Note that if you use

    procs = [mp.Process(target=popen_wrapper, args=(inqueue, outqueue))
             for i in range(max_parallel)]

then you enforce having exactly max_parallel (e.g. 3) worker processes. You then send all 15 tasks into one queue:
    for i in range(num_to_run):
        inqueue.put(i)
and let the worker processes pull tasks out of the queue:
    def popen_wrapper(inqueue, outqueue):
        for i in iter(inqueue.get, SENTINEL):
            logger.warn('echo "hi"')
            outqueue.put(i)
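If you'd rather keep the original mp.Process loop and literally "join on whoever finishes first", `multiprocessing.connection.wait` (Python 3.3+) can block on process sentinels until at least one process exits. A rough sketch, with a hypothetical `work` function standing in for popen_wrapper:

```python
import multiprocessing as mp
from multiprocessing.connection import wait

def work(i):
    pass  # stand-in for popen_wrapper's real work

if __name__ == '__main__':
    num_to_run, max_parallel = 15, 3
    running = []
    for i in range(num_to_run):
        p = mp.Process(target=work, args=(i,))
        p.start()
        running.append(p)
        while len(running) >= max_parallel:
            # blocks until at least one process's sentinel becomes ready
            done = wait([r.sentinel for r in running])
            for proc in running[:]:
                if proc.sentinel in done:
                    proc.join()
                    running.remove(proc)
    # reap whatever is still running at the end
    for p in running:
        p.join()
```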
You might also find Doug Hellmann's multiprocessing tutorial of interest. Among its many instructive examples you'll find an ActivePool recipe, which shows how to spawn 10 processes and yet limit them (using mp.Semaphore) so that only 3 are active at any given time. While that may be instructive, it may not be the best solution in your situation, since there doesn't appear to be any reason for you to spawn more than 3 processes.
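For reference, the Semaphore idea from that recipe boils down to every process acquiring a slot before doing its work. A minimal sketch, assuming a trivial `worker` function (the names here are illustrative, not taken from the recipe):

```python
import multiprocessing as mp

def worker(sem, i):
    with sem:
        # at most 3 processes execute this block at any one time;
        # the other 7 exist but sit blocked on the semaphore
        pass  # real work would go here

if __name__ == '__main__':
    sem = mp.Semaphore(3)
    procs = [mp.Process(target=worker, args=(sem, i)) for i in range(10)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```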