Sharing a dictionary between parallel processes in Python

2022-01-12 00:00:00 python multiprocessing

Problem description

I want to share a dictionary between my processes, as follows:

def f(y,x):
    y[x]=[x*x]                                                          

if __name__ == '__main__':
    pool = Pool(processes=4)
    inputs = range(10)
    y={}                             
    result = pool.map(f,y,inputs)

After this runs, y is still {}. How can I make it work?

Thanks,


Solution

It looks like you are using the multiprocessing module. You didn't say so, and that's an important piece of information.

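The .map() function on a multiprocessing.Pool() instance takes two required arguments: a function and a sequence (plus an optional chunksize). The function will be called with successive values from the sequence. In the question's call, pool.map(f, y, inputs), the empty dict y is therefore treated as the sequence and inputs is passed as chunksize, so f is never called.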

You can't collect values in a mutable object such as a dict (argument y in the example) because your code will be running in multiple different processes. Writing a value to a dict in another process doesn't send that value back to the original process. But if you use Pool.map(), the other processes will return the result of each function call to the first process. You can then collect those values to build a dict.

Example code:

import multiprocessing as mp

def f(x):
    return (x, x*x)

if __name__ == '__main__':
    pool = mp.Pool()
    inputs = range(10)
    result = dict(pool.map(f, inputs))

result is set to: {0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49, 8: 64, 9: 81}
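For contrast, the effect described earlier, where a write made in a worker never reaches the parent, can be seen in a small sketch. The names store_square and run_demo are hypothetical, and functools.partial is used here only as one way to pass the dict along with each input; this is an illustration, not the asker's exact code.

```python
import multiprocessing as mp
from functools import partial

def store_square(y, x):
    # Runs in a worker process: y here is a copy of the parent's dict,
    # so this assignment never reaches the parent.
    y[x] = [x * x]

def run_demo():
    y = {}
    with mp.Pool(processes=2) as pool:
        # partial(store_square, y) is pickled and sent to the workers,
        # which means each worker operates on its own copy of y.
        pool.map(partial(store_square, y), range(10))
    return y

if __name__ == '__main__':
    print(run_demo())  # prints {} because the parent's dict is unchanged
```

Because store_square returns nothing useful and only mutates its local copy, the parent has nothing to collect, which is why returning (key, value) pairs from the function is the approach used here.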

Let's change it so that instead of computing x*x it raises x to some power, with the power supplied as input. And let's make it take a string key argument. This means that f() needs to take a single tuple argument, where the tuple is (key, x, p), and compute x**p.

import multiprocessing as mp

def f(tup):
    key, x, p = tup  # unpack tuple into variables
    return (key, x**p)

if __name__ == '__main__':
    pool = mp.Pool()
    inputs = [("1**1", 1, 1), ("2**2", 2, 2), ("2**3", 2, 3), ("3**3", 3, 3)]
    result = dict(pool.map(f, inputs))

If you have several sequences and you need to join them together to make a single sequence for the above, look into using zip() or perhaps itertools.product.
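As a rough sketch of that suggestion, the following builds the (key, x, p) tuples from two separate lists, once with zip() and once with itertools.product. The helper names paired_inputs and all_combinations and the sample lists are assumptions made for illustration.

```python
import itertools
import multiprocessing as mp

def f(tup):
    key, x, p = tup  # unpack tuple into variables
    return (key, x**p)

def paired_inputs(bases, powers):
    # zip() pairs bases[i] with powers[i]
    return [(f"{x}**{p}", x, p) for x, p in zip(bases, powers)]

def all_combinations(bases, powers):
    # itertools.product() pairs every base with every power
    return [(f"{x}**{p}", x, p) for x, p in itertools.product(bases, powers)]

if __name__ == '__main__':
    bases = [1, 2, 3]
    powers = [1, 2, 3]
    with mp.Pool() as pool:
        paired = dict(pool.map(f, paired_inputs(bases, powers)))
        combos = dict(pool.map(f, all_combinations(bases, powers)))
    print(paired)       # {'1**1': 1, '2**2': 4, '3**3': 27}
    print(len(combos))  # 9
```

zip() fits when the sequences line up element by element; itertools.product fits when every combination is wanted.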
