Parallel execution of a class method

2022-01-12 · python · multiprocessing

Problem description

I need to execute in parallel a method of many instances of the same class. For doing this I'm trying to use the Process.start() and the Process.join() commands from the multiprocessing module.

For example, for a class:

class test:
    def __init__(self):
        ...

    def method(self):
        ...

where method modifies some of the class variables. If I make two instances of the class:

t1=test()
t2=test()

and execute:

from multiprocessing import Process
pr1 = Process(target=t1.method)
pr2 = Process(target=t2.method)
pr1.start()
pr2.start()
pr1.join()
pr2.join()

the variables of the instances of the class are not updated (the whole code is too long to be pasted here but this is the idea).

Is there any way to achieve this? Thank you


Solution

When you call obj.method in a child process, the child process gets its own separate copy of each instance variable in obj. So, the changes you make to them in the child will not be reflected in the parent. You'll need to explicitly pass the changed values back to the parent via a multiprocessing.Queue in order to make the changes take effect in the parent:

from multiprocessing import Process, Queue

q1 = Queue()
q2 = Queue()
# method must accept the queue and put its updated values on it
pr1 = Process(target=t1.method, args=(q1,))
pr2 = Process(target=t2.method, args=(q2,))
pr1.start()
pr2.start()
out1 = q1.get()  # read the results before join() so the queue doesn't block
out2 = q2.get()
t1.blah = out1   # apply the changes to the parent's copies
t2.blah = out2
pr1.join()
pr2.join()
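
For completeness, here is a minimal self-contained sketch of the Queue approach, assuming the method takes the queue as an argument and puts its updated value on it (the class, attribute, and value names here are illustrative, not from the original question):

import multiprocessing

class Test(object):

    def __init__(self):
        self.value = 0

    def method(self, q):
        self.value += 1    # only the child's copy changes
        q.put(self.value)  # send the new value back to the parent


if __name__ == "__main__":
    t1 = Test()
    q1 = multiprocessing.Queue()
    pr1 = multiprocessing.Process(target=t1.method, args=(q1,))
    pr1.start()
    t1.value = q1.get()    # explicitly update the parent's copy
    pr1.join()
    print(t1.value)        # 1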

Other options would be to make the instance variables you need to change multiprocessing.Value instances, or multiprocessing.Manager Proxy instances. That way, the changes you make in the children would be reflected in the parent automatically. But that comes at the cost of adding overhead to using the variables in the parent.
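
As an illustration of the multiprocessing.Value option, here is a minimal sketch using a shared integer counter (the attribute name and the 'i' type code are just for illustration):

import multiprocessing

class Test(object):

    def __init__(self):
        # 'i' means a C signed int stored in shared memory
        self.counter = multiprocessing.Value('i', 0)

    def method(self):
        with self.counter.get_lock():
            self.counter.value += 1  # visible to the parent as well


if __name__ == "__main__":
    t1 = Test()
    pr1 = multiprocessing.Process(target=t1.method)
    pr1.start()
    pr1.join()
    print(t1.counter.value)  # 1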

Here's an example using multiprocessing.Manager. This doesn't work:

import multiprocessing

class Test(object):

    def __init__(self):
        self.some_list = []  # Normal list

    def method(self):
        self.some_list.append(123)  # This change gets lost


if __name__ == "__main__":
    t1 = Test()
    t2 = Test()
    pr1 = multiprocessing.Process(target=t1.method)
    pr2 = multiprocessing.Process(target=t2.method)
    pr1.start()
    pr2.start()
    pr1.join()
    pr2.join()
    print(t1.some_list)
    print(t2.some_list)

Output:

[]
[]

This works:

import multiprocessing

class Test(object):

    def __init__(self):
        self.manager = multiprocessing.Manager()
        self.some_list = self.manager.list()  # Shared Proxy to a list

    def method(self):
        self.some_list.append(123) # This change won't be lost


if __name__ == "__main__":
    t1 = Test()
    t2 = Test()
    pr1 = multiprocessing.Process(target=t1.method)
    pr2 = multiprocessing.Process(target=t2.method)
    pr1.start()
    pr2.start()
    pr1.join()
    pr2.join()
    print(t1.some_list)
    print(t2.some_list)

Output:

[123]
[123]

Just keep in mind that a multiprocessing.Manager starts a child process to manage all the shared instances you create, and that every time you access one of the Proxy instances, you're actually making an IPC call to the Manager process.
