Python multiprocessing and NumPy random

2022-01-12 00:00:00 python numpy multiprocessing

Problem description

Does the scope of a NumPy ndarray behave differently inside a function called by multiprocessing? Here is an example:

Using Python's multiprocessing module, I am calling a function like so:

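import random
import multiprocessing as mp
import numpy as np

cores = 4
# Assume an array with 4 columns and n rows (placeholder data)
globalshared_array = np.arange(40).reshape(10, 4)

def f(core):
    x = 0
    x += random.randint(0, 10)
    print(x)

def g(core):
    local = np.copy(globalshared_array[:, core])
    shuffled = np.random.permutation(local)
    print(shuffled)

if __name__ == "__main__":
    jobs = []
    for core in range(cores):
        # target could be f or g; args must be a tuple, hence the trailing comma
        proc = mp.Process(target=f, args=(core,))
        jobs.append(proc)
    for job in jobs:
        job.start()
    for job in jobs:
        job.join()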

When calling f(core), the x variable is local to the process, i.e. each process prints a different random integer, as expected. These values never exceed 10, indicating that x = 0 at the start of each process. Is that correct?

Calling g(core) and permuting a copy of the array returns 4 identically "shuffled" arrays. This seems to indicate that the working copy is not local to the child process. Is that correct? If so, other than using shared memory space, is it possible to have an ndarray be local to the child process when it needs to be filled from shared memory space?

Altering g(core) to add a random integer appears to have the desired effect: the arrays show different values. Something must be happening in permutation that orders the columns (each local to its child process) in the same way in every process. Any ideas?

def g(core):
    #Assume an array with 4 columns and n rows
    local = np.copy(globalshared_array[:,core])
    local += random.randint(0,10)

EDIT II: np.random.shuffle exhibits the same behavior. The contents of the array are shuffled, but they are shuffled into the same order on each core.


Solution

"Calling g(core) and permuting a copy of the array returns 4 identically 'shuffled' arrays. This seems to indicate that the working copy is not local to the child process."

What it more likely indicates is that the random number generator is initialized identically in each child process, so every child produces the same sequence. This is what you would expect if each child starts from a copy of the parent's NumPy generator state, as happens when processes are forked. You need to seed each child's generator (perhaps mixing the child's process id into the seed).
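A minimal sketch of per-child seeding follows. It is one possible approach, not the only one. It assumes NumPy 1.17 or later for np.random.default_rng. The helper name shuffled_copy, the placeholder contents of globalshared_array, and the choice of combining the process id with the column index as the seed are all illustrative assumptions that do not come from the question.

```python
import multiprocessing as mp
import os

import numpy as np

# Placeholder data standing in for the question's globalshared_array
# (4 columns, n rows); the real array is not shown in the question.
globalshared_array = np.arange(40).reshape(10, 4)


def shuffled_copy(column, seed):
    """Return a shuffled copy of `column` using a generator built from `seed`."""
    # A fresh Generator does not depend on the global np.random state
    # that a forked child copies from its parent.
    rng = np.random.default_rng(seed)
    return rng.permutation(column)


def g(core):
    local = np.copy(globalshared_array[:, core])
    # Hypothetical seeding scheme: combine the child's process id with the
    # column index so that no two children start from the same state.
    shuffled = shuffled_copy(local, [os.getpid(), core])
    print(core, shuffled)


if __name__ == "__main__":
    jobs = [mp.Process(target=g, args=(core,)) for core in range(4)]
    for job in jobs:
        job.start()
    for job in jobs:
        job.join()
```

If the legacy interface from the question is preferred, calling np.random.seed() with no argument at the top of g(core) should also work, since it reseeds the global generator from fresh entropy in each child. Seeding from the process id makes runs non-reproducible; passing a fixed base value in place of os.getpid() would give repeatable but still distinct shuffles per column.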
