Python multiprocessing: why are large chunksizes slower?
Problem description
I've been profiling some code using Python's multiprocessing module (the 'job' function just squares the number).
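import multiprocessing
import time

def job(x):
    # The 'job' function just squares the number.
    return x * x

if __name__ == "__main__":
    data = range(100000000)
    n = 4
    time1 = time.time()
    processes = multiprocessing.Pool(processes=n)
    results_list = processes.map(func=job, iterable=data, chunksize=10000)
    processes.close()
    time2 = time.time()
    print(time2 - time1)
    print(results_list[0:10])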
One thing I found odd is that the optimal chunksize appears to be around 10k elements: this took 16 seconds on my computer. If I increase the chunksize to 100k or 200k, it slows to 20 seconds.
Could this difference be because pickling takes longer for longer lists? A chunksize of 100 elements takes 62 seconds, which I assume is due to the extra time required to pass the chunks back and forth between the different processes.
Solution
About optimal chunksize:
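- Having a large number of small chunks lets the 4 workers distribute the load more evenly, so smaller chunks are desirable.
- On the other hand, every new chunk adds overhead: the chunk has to be passed between processes, and the associated context switches cost time. Fewer context switches, and therefore fewer chunks, are desirable.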
As the two rules pull in opposite directions, a point in the middle is the way to go, similar to a supply-demand chart.
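To make the trade-off concrete, here is a minimal sketch that times Pool.map for a few chunksizes and prints how many chunks each one produces. The helper names count_chunks and time_map, the input size of 200,000 and the chunksizes tried are assumptions chosen for illustration, not part of the original question. The timings depend on the machine, so no expected output is shown.

```python
import math
import multiprocessing
import time


def job(x):
    # Same trivial workload as in the question: square the number.
    return x * x


def count_chunks(n_items, chunksize):
    # Number of chunks the iterable is split into for a given chunksize
    # (hypothetical helper, not part of the multiprocessing API).
    return math.ceil(n_items / chunksize)


def time_map(n_items, chunksize, n_workers=4):
    # Time a single Pool.map call; returns (elapsed seconds, results).
    with multiprocessing.Pool(processes=n_workers) as pool:
        start = time.perf_counter()
        results = pool.map(job, range(n_items), chunksize=chunksize)
        elapsed = time.perf_counter() - start
    return elapsed, results


if __name__ == "__main__":
    # Assumed size, far smaller than the question's 100,000,000 elements.
    n_items = 200_000
    for chunksize in (10, 1_000, 50_000):
        elapsed, results = time_map(n_items, chunksize)
        chunks = count_chunks(n_items, chunksize)
        print(f"chunksize={chunksize:>6}  chunks={chunks:>6}  time={elapsed:.3f}s")
```

For the question's 100,000,000 elements, the same arithmetic gives 1,000,000 chunks at a chunksize of 100, 10,000 chunks at 10k and 1,000 chunks at 100k, which shows how sharply the per-chunk overhead grows as the chunks get smaller.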