multiprocessing.Queue 项目的最大大小?

问题描述

我正在使用 Python 处理一个相当大的项目,该项目需要将计算密集型后台任务之一卸载到另一个核心,以便不会降低主要服务的速度.在使用 multiprocessing.Queue 来传达工作进程的结果时,我遇到了一些明显奇怪的行为.为 threading.Threadmultiprocessing.Process 使用相同的队列以进行比较,线程工作得很好,但在放入大项目后进程无法加入队列.观察:

I'm working on a fairly large project in Python that requires one of the compute-intensive background tasks to be offloaded to another core, so that the main service isn't slowed down. I've come across some apparently strange behaviour when using multiprocessing.Queue to communicate results from the worker process. Using the same queue for both a threading.Thread and a multiprocessing.Process for comparison purposes, the thread works just fine but the process fails to join after putting a large item in the queue. Observe:

import threading
import multiprocessing

class WorkerThread(threading.Thread):
    def __init__(self, queue, size):
        threading.Thread.__init__(self)
        self.queue = queue
        self.size = size

    def run(self):
        self.queue.put(range(size))


class WorkerProcess(multiprocessing.Process):
    def __init__(self, queue, size):
        multiprocessing.Process.__init__(self)
        self.queue = queue
        self.size = size

    def run(self):
        self.queue.put(range(size))


if __name__ == "__main__":
    size = 100000
    queue = multiprocessing.Queue()

    worker_t = WorkerThread(queue, size)
    worker_p = WorkerProcess(queue, size)

    worker_t.start()
    worker_t.join()
    print 'thread results length:', len(queue.get())

    worker_p.start()
    worker_p.join()
    print 'process results length:', len(queue.get())

我发现这对于 size = 10000 工作正常,但对于 size = 100000 会挂在 worker_p.join().multiprocessing.Process 实例可以放入 multiprocessing.Queue 的内容是否存在一些固有的大小限制?还是我在这里犯了一些明显的根本性错误?

I've seen that this works fine for size = 10000, but hangs at worker_p.join() for size = 100000. Is there some inherent size limit to what multiprocessing.Process instances can put in a multiprocessing.Queue? Or am I making some obvious, fundamental mistake here?

作为参考,我在 Ubuntu 10.04 上使用 Python 2.6.5.

For reference, I am using Python 2.6.5 on Ubuntu 10.04.


解决方案

似乎底层管道已满,因此在写入管道时,馈线线程阻塞(实际上是在尝试获取保护管道免受并发访问的锁时).

Seems the underlying pipe is full, so the feeder thread blocks on the write to the pipe (actually when trying to acquire the lock protecting the pipe from concurrent access).

检查这个问题http://bugs.python.org/issue8237

相关文章