Deadlock on put and get with Python multiprocessing.Queue

Problem description

I'm having deadlock problems with this piece of code:


import multiprocessing


def _entropy_split_parallel(data_train, answers_train, weights):
    CPUS = 1 #multiprocessing.cpu_count()
    NUMBER_TASKS = len(data_train[0])
    processes = []

    multi_list = zip(data_train, answers_train, weights)

    task_queue = multiprocessing.Queue()
    done_queue = multiprocessing.Queue()

    for feature_index in xrange(NUMBER_TASKS):
        task_queue.put(feature_index)

    for i in xrange(CPUS):
        process = multiprocessing.Process(target=_worker, 
                args=(multi_list, task_queue, done_queue))
        processes.append(process)
        process.start()

    min_entropy = None
    best_feature = None
    best_split = None
    for i in xrange(NUMBER_TASKS):
        entropy, feature, split = done_queue.get()
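        # Skip features with no valid split; keep the lowest entropy seen so far.
        if entropy is not None and (min_entropy is None or entropy < min_entropy):
            min_entropy = entropy
            best_feature = feature
            best_split = split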

    for i in xrange(CPUS):
        task_queue.put('STOP')

    for process in processes:
        process.join()

    return best_feature, best_split


def _worker(multi_list, task_queue, done_queue):
    feature_index = task_queue.get()
    while feature_index != 'STOP':
        result = _entropy_split3(multi_list, feature_index)
        done_queue.put(result)
        feature_index = task_queue.get()

When I run my program, it works fine for several runs through _entropy_split_parallel, but eventually deadlocks. The parent process is blocking on done_queue.get(), and the worker process is blocking on done_queue.put(). Since the queue is always empty when this happens, blocking on get is expected. What I don't understand is why the worker is blocking on put, since the queue is obviously not full (it's empty!). I've tried the block and timeout keyword arguments, but get the same result.

I'm using the multiprocessing backport, since I'm stuck with Python 2.5.

It looks like I'm also getting deadlock issues with one of the examples provided with the multiprocessing module. It's the third example from the bottom here. The deadlocking only seems to occur if I call the test method many times. For example, changing the bottom of the script to this:


if __name__ == '__main__':
    freeze_support()
    for x in xrange(1000):
        test()

I know this is an old question, but testing shows that this is no longer a problem on Windows with Python 2.7. I will try Linux and report back.


Solution

This problem went away with newer versions of Python, so I'm assuming it was a problem with the backport. Anyways, it's no longer an issue.
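
For anyone wanting to try the same task-queue/done-queue pattern on a current interpreter, here is a minimal sketch. It targets Python 3, not the 2.5 backport from the question. `_score` is a hypothetical stand-in for `_entropy_split3`, whose body was never posted, and the `ctx` parameter and the 30-second timeout are my own additions. It is not a diagnosis of the original hang, just the pattern written so that a stuck worker shows up as a `queue.Empty` exception instead of a silent block.

```python
import multiprocessing

STOP = "STOP"  # sentinel telling a worker to exit


def _score(feature_index, data):
    """Hypothetical stand-in for _entropy_split3.

    Returns (entropy, feature_index, split), the tuple shape the
    question's code unpacks. The numbers are placeholders only.
    """
    column = [row[feature_index] for row in data]
    entropy = float(max(column) - min(column))
    split = sum(column) / len(column)
    return entropy, feature_index, split


def _worker(data, task_queue, done_queue):
    # iter(callable, sentinel) keeps calling get() until STOP arrives.
    for feature_index in iter(task_queue.get, STOP):
        done_queue.put(_score(feature_index, data))


def best_split_parallel(data, n_workers=2, ctx=None):
    # ctx is an assumption of this sketch: it lets the caller pick a
    # start method; by default the platform's default context is used.
    ctx = ctx or multiprocessing.get_context()
    n_tasks = len(data[0])
    task_queue = ctx.Queue()
    done_queue = ctx.Queue()

    # Queue every task, then one STOP per worker, before starting them.
    for feature_index in range(n_tasks):
        task_queue.put(feature_index)
    for _ in range(n_workers):
        task_queue.put(STOP)

    processes = [
        ctx.Process(target=_worker, args=(data, task_queue, done_queue))
        for _ in range(n_workers)
    ]
    for process in processes:
        process.start()

    best = None
    for _ in range(n_tasks):
        # The timeout raises queue.Empty instead of blocking forever.
        result = done_queue.get(timeout=30)
        if best is None or result[0] < best[0]:
            best = result

    # Join only after done_queue has been drained.
    for process in processes:
        process.join()

    return best[1], best[2]
```

Calling `best_split_parallel(rows)` with a list of equal-length rows returns the `(feature, split)` pair with the lowest score. On Windows, or with any start method other than fork, the call needs to sit under `if __name__ == '__main__':` as in the test script above. Draining `done_queue` before `join()` follows the order the multiprocessing documentation recommends for processes that put items on a queue.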
