python多处理参数:深拷贝?

2022-01-12 00:00:00 python multiprocessing deep-copy

问题描述

from multiprocessing import Process
# c is a container
p = Process(target = f, args = (c,))
p.start()

我假设 c 的深拷贝被传递给函数 f 因为浅拷贝在新进程的情况下没有意义(新进程不可以访问来自调用进程的数据).

I assume a deep copy of c is passed to function f because shallow copy would make no sense in the case of a new process (the new process doesn't have access to the data from the calling process).

但是这个深拷贝是如何定义的呢?copy 中有一整套 注释集.deepcopy() 文档,所有这些注释是否也适用于这里?multiprocessing 文档什么也没说...

But how is this deep copy defined? There is a whole set of notes in the copy.deepcopy() documentation, do all these notes apply here as well? The multiprocessing documentation says nothing...


解决方案

当你创建一个 Process 实例时,Python 会在底层发出一个 fork().这会创建一个子进程,其内存空间是其父进程的精确副本——因此在 fork 时存在的所有内容都会被复制.

When you create a Process instance, under the hood Python issues a fork(). This creates a child process whose memory space is an exact copy of its parent -- so everything existing at the time of the fork is copied.

在 Linux 上,这是通过写时复制"来提高效率的.来自 fork 手册页:

On Linux this is made efficient through "copy-on-write". From the fork man page:

fork() 创建一个子进程仅与父进程不同在它的 PID 和 PPID 中,事实上资源利用率设置为0. 文件锁和挂起信号不被继承.

fork() creates a child process that differs from the parent process only in its PID and PPID, and in the fact that resource utilizations are set to 0. File locks and pending signals are not inherited.

Linux下实现fork()使用写时复制页面,所以唯一的它招致的惩罚是时间和复制所需的内存父的页表,并创建一个孩子的独特任务结构.

Under Linux, fork() is implemented using copy-on-write pages, so the only penalty that it incurs is the time and memory required to duplicate the parent's page tables, and to create a unique task structure for the child.

相关文章