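The best solution here is to use a Pool. Using Queues together with a separate "queue feeding" function is probably overkill.
Here is a slightly rearranged version of your program, this time with only 2 processes corralled in a Pool. I believe it is the easiest way to go, with minimal changes to the original code: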
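import multiprocessing
import time

data = (
    ['a', '2'], ['b', '4'], ['c', '6'], ['d', '8'],
    ['e', '1'], ['f', '3'], ['g', '5'], ['h', '7']
)

def mp_worker(args):
    # map() hands over each element of data as one argument, so unpack it here
    # (Python 3 no longer allows tuple parameters in the signature).
    inputs, the_time = args
    print(" Processs %s\tWaiting %s seconds" % (inputs, the_time))
    time.sleep(int(the_time))
    print(" Process %s\tDONE" % inputs)

def mp_handler():
    # A single pool of 2 worker processes shares all the tasks.
    p = multiprocessing.Pool(2)
    p.map(mp_worker, data)

if __name__ == '__main__':
    mp_handler()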
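Note that mp_worker() now accepts a single argument (a tuple of the two previous arguments), because map() passes each element of your input data, here a two-item sublist, to the worker function as one argument.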
Output:
Processs a Waiting 2 seconds
Processs b Waiting 4 seconds
Process a DONE
Processs c Waiting 6 seconds
Process b DONE
Processs d Waiting 8 seconds
Process c DONE
Processs e Waiting 1 seconds
Process e DONE
Processs f Waiting 3 seconds
Process d DONE
Processs g Waiting 5 seconds
Process f DONE
Processs h Waiting 7 seconds
Process g DONE
Process h DONE
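If you would rather keep a worker with two separate parameters, Pool.starmap() (Python 3.3+) unpacks each element for you. Below is a minimal sketch comparing the two call styles; the names worker_single, worker_unpacked, and tasks are made up for this illustration, and the delays are shortened so it runs quickly.

```python
import multiprocessing
import time

# Hypothetical sample data for this sketch; delays are kept short on purpose.
tasks = [("a", 0.02), ("b", 0.01), ("c", 0.03)]

def worker_single(task):
    """For Pool.map(): receives one element of the iterable and unpacks it."""
    name, delay = task
    time.sleep(delay)
    return "%s slept %s" % (name, delay)

def worker_unpacked(name, delay):
    """For Pool.starmap(): each element is unpacked into positional arguments."""
    time.sleep(delay)
    return "%s slept %s" % (name, delay)

if __name__ == "__main__":
    with multiprocessing.Pool(2) as pool:
        via_map = pool.map(worker_single, tasks)
        via_starmap = pool.starmap(worker_unpacked, tasks)
    # Both calls return their results in input order,
    # regardless of which task finished first.
    print(via_map)
    print(via_starmap)
```

Either style works; map() with a single tuple argument just stays closer to the code in this answer.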
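Edit, per @Thales's comment below:
If you want "a lock for each pool limit", so that your processes run in tandem pairs, like this:
A waiting, B waiting | A done, B done | C waiting, D waiting | C done, D done | ...
then change the handler function to launch a pool (of 2 processes) for each pair of data: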
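def mp_handler():
    # Pair up consecutive tasks: (a, b), (c, d), ...
    subdata = zip(data[0::2], data[1::2])
    for task1, task2 in subdata:
        # A fresh 2-process pool per pair; map() blocks until both tasks finish.
        p = multiprocessing.Pool(2)
        p.map(mp_worker, (task1, task2))
        p.close()
        p.join()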
Now your output is:
Processs a Waiting 2 seconds
Processs b Waiting 4 seconds
Process a DONE
Process b DONE
Processs c Waiting 6 seconds
Processs d Waiting 8 seconds
Process c DONE
Process d DONE
Processs e Waiting 1 seconds
Processs f Waiting 3 seconds
Process e DONE
Process f DONE
Processs g Waiting 5 seconds
Processs h Waiting 7 seconds
Process g DONE
Process h DONE
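To see why the two versions print in a different order, here is a rough timeline sketch that spawns no processes at all. It assumes the shared pool hands tasks out one at a time, in order, to whichever worker is free (which matches the first output above), and it ignores pool start-up overhead. The function names are hypothetical helpers, not part of multiprocessing.

```python
import heapq

data = (
    ['a', '2'], ['b', '4'], ['c', '6'], ['d', '8'],
    ['e', '1'], ['f', '3'], ['g', '5'], ['h', '7']
)

def shared_pool_finish_times(tasks, workers=2):
    """One pool for all tasks: a free worker immediately takes the next task."""
    free_at = [0] * workers  # time at which each worker becomes idle
    heapq.heapify(free_at)
    finish_times = {}
    for name, seconds in tasks:
        start = heapq.heappop(free_at)  # earliest available worker
        finish_times[name] = start + int(seconds)
        heapq.heappush(free_at, finish_times[name])
    return finish_times

def paired_finish_times(tasks):
    """One pool per pair: the next pair starts only when both tasks are done."""
    finish_times = {}
    now = 0
    # zip() stops at the shorter slice, so an odd trailing task would be dropped.
    for (name1, sec1), (name2, sec2) in zip(tasks[0::2], tasks[1::2]):
        finish_times[name1] = now + int(sec1)
        finish_times[name2] = now + int(sec2)
        now = max(finish_times[name1], finish_times[name2])
    return finish_times

if __name__ == "__main__":
    print(max(shared_pool_finish_times(data).values()))  # 19
    print(max(paired_finish_times(data).values()))       # 22
```

Under these assumptions the paired version takes about 22 seconds of sleeping against about 19 for the single shared pool, because the faster task of each pair leaves its worker idle until the slower one finishes. Note also that with an odd number of tasks the zip-based pairing silently skips the last one.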