Using multiprocessing.Process with a maximum number of simultaneous processes

2022-01-12 00:00:00 python multithreading multiprocessing

Question

I have this Python code:

from multiprocessing import Process

MAX_PROCESSES = 512  # varies between 1 and 512 in practice

def f(name):
    print('hello', name)

if __name__ == '__main__':
    for i in range(0, MAX_PROCESSES):
        p = Process(target=f, args=(i,))
        p.start()

It runs well. However, MAX_PROCESSES is variable and can be any value between 1 and 512. Since I'm only running this code on a machine with 8 cores, I need to find out whether it is possible to limit the number of processes allowed to run at the same time. I've looked into multiprocessing.Queue, but it doesn't look like what I need, or perhaps I'm misreading the docs.

Is there a way to limit the number of multiprocessing.Process instances running simultaneously?


Answer

The most sensible approach is probably multiprocessing.Pool, which creates a pool of worker processes (by default, as many as there are cores on your system) and then feeds tasks to the workers as they become free.

The example from the standard docs (http://docs.python.org/2/library/multiprocessing.html#using-a-pool-of-workers) shows that you can also set the number of worker processes manually:

from multiprocessing import Pool

def f(x):
    return x*x

if __name__ == '__main__':
    pool = Pool(processes=4)              # start 4 worker processes
    result = pool.apply_async(f, [10])    # evaluate "f(10)" asynchronously
    print(result.get(timeout=1))          # prints "100" unless your computer is *very* slow
    print(pool.map(f, range(10)))         # prints "[0, 1, 4,..., 81]"

It's also handy to know that multiprocessing.cpu_count() returns the number of cores on the system, if you need that in your code.
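As a rough sketch of how cpu_count() might be used to size a pool, assuming Python 3: the helper name pick_worker_count and the cap of 8 are illustrative assumptions, not part of the multiprocessing API.

```python
import multiprocessing


def pick_worker_count(num_tasks, cap=None):
    """Hypothetical helper: choose how many worker processes to start."""
    cores = multiprocessing.cpu_count()
    # Never exceed the core count, and optionally respect a user-supplied cap.
    limit = cores if cap is None else min(cores, cap)
    # Do not start more workers than there are tasks, but start at least one.
    return max(1, min(num_tasks, limit))


def square(x):
    return x * x


if __name__ == '__main__':
    workers = pick_worker_count(num_tasks=512, cap=8)
    with multiprocessing.Pool(processes=workers) as pool:
        print(pool.map(square, range(10)))
```

Capping at the task count avoids starting idle workers when there are only a handful of tasks.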

Here's some draft code that seems to work for your specific case:

import multiprocessing

def f(name):
    print('hello', name)

if __name__ == '__main__':
    # Use all available cores by default; pass processes=N to cap the worker count.
    pool = multiprocessing.Pool()
    for i in range(0, 512):
        pool.apply_async(f, args=(i,))
    pool.close()  # no more tasks will be submitted
    pool.join()   # wait for all queued tasks to finish
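One way to check that the pool really bounds concurrency is to have each task return its worker's PID and count the distinct values. The following is only a sketch, assuming Python 3; run_tasks is a hypothetical helper, and f is changed here to return a value rather than print. Calling get() on each result also surfaces any exception raised inside a worker, which a bare apply_async call would otherwise hide.

```python
import multiprocessing
import os


def f(name):
    # Return the worker's PID instead of printing, so the parent can inspect it.
    return os.getpid()


def run_tasks(num_tasks, num_workers):
    """Hypothetical helper: run f once per task, return the set of worker PIDs."""
    with multiprocessing.Pool(processes=num_workers) as pool:
        async_results = [pool.apply_async(f, args=(i,)) for i in range(num_tasks)]
        # get() blocks until the task finishes and re-raises worker exceptions.
        pids = {r.get() for r in async_results}
    return pids


if __name__ == '__main__':
    pids = run_tasks(num_tasks=512, num_workers=8)
    print(len(pids), 'distinct worker processes handled 512 tasks')
```

The count printed should never exceed num_workers, since all 512 tasks are queued and handed to the same fixed set of workers.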
