Using Python multiprocessing with a different random seed for each process

2022-01-12 00:00:00 python multiprocessing

Problem description

I wish to run several instances of a simulation in parallel, with each simulation having its own independent data set.

Currently I implement this as follows:

import multiprocessing as mp

P = mp.Pool(ncpus)  # create a pool of ncpus worker processes
for j in range(nrun):  # submit one task per simulation run
    sim = MDF.Simulation(tstep, temp, time, writeout, boundaryxy, boundaryz, relax, insert, lat, savetemp)
    lattice = MDF.Lattice(tstep, temp, time, writeout, boundaryxy, boundaryz, relax, insert, lat, kb, ks, kbs, a, p, q, massL, randinit, initvel, parangle, scaletemp, savetemp)
    adatom1 = MDF.Adatom(tstep, temp, time, writeout, boundaryxy, boundaryz, relax, insert, lat, ra, massa, amorse, bmorse, r0, z0, name, lattice, samplerate, savetemp)
    P.apply_async(run, (j, sim, lattice, adatom1), callback=After)  # run simulation and ISF analysis in a worker process
P.close()  # no further tasks will be submitted
P.join()  # wait for all worker processes to finish

where sim, adatom1 and lattice are objects passed to the function run, which starts the simulation.

However, I recently found out that the runs in each batch executed simultaneously (that is, each group of ncpus runs out of the total of nrun simulation runs) give exactly the same results.

Can someone here explain how to fix this?


Solution

Just thought I would add an actual answer to make it clear for others.

Quoting the answer from aix in this question:

What happens is that on Unix every worker process inherits the same state of the random number generator from the parent process. This is why they generate identical pseudo-random sequences.

Use the random.seed() method (or the scipy/numpy equivalent) to set the seed properly. See also this numpy thread.
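As a rough illustration of the advice above, here is a minimal sketch that seeds the generator at the start of each task, using the run index j to derive a distinct seed. It assumes the simulation draws its numbers from Python's random module; if it uses NumPy's global generator instead, the analogous call would be numpy.random.seed. The names BASE_SEED, run_all and after, as well as the body of run, are hypothetical stand-ins and not the original simulation code.

```python
import multiprocessing as mp
import random

BASE_SEED = 12345  # hypothetical base seed; any integer works


def run(j, base_seed=BASE_SEED):
    """Hypothetical stand-in for the simulation: returns three random draws."""
    # Seed inside the task so each run index gets its own sequence,
    # regardless of what state the worker inherited from the parent.
    random.seed(base_seed + j)
    return j, [random.random() for _ in range(3)]


def run_all(nrun, ncpus):
    """Submit nrun tasks to a pool of ncpus workers; collect results by index."""
    results = {}

    def after(result):
        # The callback runs in the parent process once a task finishes.
        j, draws = result
        results[j] = draws

    with mp.Pool(ncpus) as pool:
        for j in range(nrun):
            pool.apply_async(run, (j,), callback=after)
        pool.close()  # no further tasks will be submitted
        pool.join()   # wait for all workers to finish
    return results


if __name__ == "__main__":
    for j, draws in sorted(run_all(nrun=4, ncpus=2).items()):
        print(j, draws)
```

Deriving the seed from j keeps each run reproducible. If reproducibility is not needed, calling random.seed() with no argument inside run would instead seed from an operating-system entropy source.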
