multiprocessing.Pool seems to work in Windows but not in Ubuntu?
Problem description
SOLVED: The problem was the Wingware Python IDE. I guess the natural question now is how this is possible and how it can be fixed.
I asked a question yesterday (Problem with multiprocessing.Pool in Python), and this question is almost the same, but I have since found that the code seems to work on a Windows computer and not on my Ubuntu machine. At the end of this post I include a slightly different version of the code that does the same thing.
Short summary of my problem: when using multiprocessing.Pool in Python, I do not always get the number of workers I ask for. When this happens, the program just stalls.
I have been looking for a solution all day, and then I thought about Noah's comment on my previous question. He said that it worked on his machine, so I gave the code to my colleague, who runs a Windows machine with Enthought's 64-bit Python 2.7.1 distribution. I have the same distribution, with the big difference that mine runs on Ubuntu. I should also mention that we both use the Wingware Python IDE, but I doubt that this matters.
There are two problems with my code that do not arise when my colleague runs it on his machine:
1. I am not always able to get the four workers I ask for (although my machine has 12 hardware threads). When this happens, the process just stalls and does not continue. No exception or error is raised.
2. When I do get the four workers I ask for (roughly 1 time out of 5), the figures that are produced (plain random numbers) are EXACTLY the same for all four pictures. This is not the case for my colleague.
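A possible explanation for the identical figures, separate from the IDE issue: on Linux, multiprocessing creates its workers with fork, so each worker starts with a copy of the parent's global NumPy random state and may draw the same sequence, whereas on Windows each worker is a freshly started interpreter. One way around this is to give every task its own seed and a private generator. The sketch below is only an illustration under that assumption: it uses the standard-library random module as a stand-in for scipy.stats, and simulate, make_tasks, and base_seed are hypothetical names that do not appear in the code in this post.

```python
import multiprocessing as mp
import random


def simulate(args):
    """Return a list of length n: x0 first, then n - 1 standard-normal draws."""
    x0, n, seed = args
    # Private generator seeded per task, so nothing depends on state
    # inherited from the parent process.
    rng = random.Random(seed)
    return [x0] + [rng.gauss(0.0, 1.0) for _ in range(n - 1)]


def make_tasks(n_tasks, n, base_seed=12345):
    """Build one (x0, n, seed) tuple per task, each with a distinct seed."""
    return [(x0, n, base_seed + x0) for x0 in range(n_tasks)]


if __name__ == "__main__":
    tasks = make_tasks(n_tasks=4, n=5)
    pool = mp.Pool(processes=4)
    try:
        results = pool.map(simulate, tasks)
    finally:
        pool.close()
        pool.join()
    for series in results:
        print(series)
```

With NumPy or SciPy, the analogous step would be to call numpy.random.seed with a per-task value at the start of the worker function.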
Something is very fishy, and I am very thankful for any help you can offer.
Code:
import multiprocessing as mp
import scipy as sp
import scipy.stats as spstat
import pylab

def testfunc(x0, N):
    print 'working with x0 = %s' % x0
    x = [x0]
    for i in xrange(1, N):
        x.append(spstat.norm.rvs(size = 1)) # stupid appending to make it slower
        if i % 10000 == 0:
            print 'x0 = %s, i = %s' % (x0, i)
    return sp.array(x)

def testfuncParallel(fargs):
    return testfunc(*fargs)

# Define number of tasks.
nTasks = 4
N = 100000

if __name__ == '__main__':
    """
    Try number 1. Using multiprocessing.Pool together with Pool.map_async
    """
    pool = mp.Pool(processes = nTasks) # I have 12 threads (six cores) available, so I am surprised that it does not get access to nTasks = 4 workers
    # Define tasks:
    tasks = [(x, n) for x, n in enumerate(nTasks*[N])] # nTasks different tasks
    # Compute in parallel: async - asynchronously, i.e. not necessarily in order.
    result = pool.map_async(testfuncParallel, tasks)
    pool.close() # These are needed if map_async is used
    pool.join()
    # Get results:
    sim = sp.zeros((N, nTasks))
    for nn, res in enumerate(result.get()):
        sim[:, nn] = res
    pylab.figure()
    for i in xrange(nTasks):
        pylab.subplot(nTasks, 1, i + 1)
        pylab.plot(sim[:, i])
    pylab.show()
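Because the script calls pool.join() before result.get(), a worker that never finishes blocks the parent forever without any message. As a debugging aid, and not a fix for the underlying cause, the result can be fetched with a timeout first so that a stall becomes visible. The following is a minimal sketch of that idea; run_with_timeout and square are hypothetical names, and the 30-second limit is an arbitrary assumption.

```python
import multiprocessing as mp
import time


def run_with_timeout(pool, func, tasks, timeout):
    """Map func over tasks; return the results, or None if they do not arrive in time."""
    async_result = pool.map_async(func, tasks)
    try:
        # get() raises multiprocessing.TimeoutError if the results are late.
        results = async_result.get(timeout=timeout)
    except mp.TimeoutError:
        pool.terminate()  # give up on workers that are still busy
        pool.join()
        return None
    pool.close()
    pool.join()
    return results


def square(x):
    """Trivial stand-in for the real worker function."""
    return x * x


if __name__ == "__main__":
    start = time.time()
    pool = mp.Pool(processes=4)
    outcome = run_with_timeout(pool, square, range(8), timeout=30)
    if outcome is None:
        print("pool stalled; gave up after %.1f s" % (time.time() - start))
    else:
        print(outcome)
```

Returning None on a stall keeps the caller simple; re-raising the TimeoutError would be an equally reasonable choice.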
Thanks in advance.
Sincerely, Matias
Solution
Update: It turns out this had nothing to do with matplotlib or its backends, but rather with a bug associated with multiprocessing in general. We've fixed this for Wing version 4.0.4+. The workaround is not to set breakpoints in code that is executed in the sub-processes.
It seems to be Wing IDE's matplotlib support for the Tkinter backend interacting badly with multiprocessing. When I try this example, it crashes in Tcl/Tk code. I suspect the person working on Windows was using a different matplotlib backend.
Turning off "matplotlib event loop support" in Project Properties, under the Extensions tab, seems to work around it.
Alternatively, adding the following seems to fix it for me when "matplotlib event loop support" is turned on:
import matplotlib
matplotlib.use('WXAgg')
This will only work if you have the WXAgg backend. Other backends supported by Wing IDE (in such a way that plots remain interactive even if the debug process is paused) are GTKAgg and Qt4Agg, but I haven't tried those yet.
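Since the right backend depends on which GUI toolkit is installed, one option is to pick the first candidate whose toolkit module can be imported before calling matplotlib.use. This is a rough sketch only: first_importable and CANDIDATES are hypothetical names, and the mapping from backend name to toolkit module (wx, gtk, PyQt4) is an assumption that matches the backends named above, not something taken from Wing IDE.

```python
def first_importable(candidates):
    """Return the first backend whose toolkit module imports, or None."""
    for backend, module_name in candidates:
        try:
            __import__(module_name)
        except ImportError:
            continue
        return backend
    return None


# Assumed mapping from matplotlib backend name to the toolkit module it needs.
CANDIDATES = [("WXAgg", "wx"), ("GTKAgg", "gtk"), ("Qt4Agg", "PyQt4")]

if __name__ == "__main__":
    backend = first_importable(CANDIDATES)
    print("selected backend: %s" % backend)
    if backend is not None:
        import matplotlib
        # The backend has to be chosen before pylab or pyplot is imported.
        matplotlib.use(backend)
```

If none of the toolkits is present, the sketch leaves matplotlib's default backend untouched.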
I'll see if I can find and fix the bug. I suspect we need to disable our event loop support when the process ID changes. Thanks for reporting this.