What exactly is the .join() method of Python's multiprocessing module doing?
Question
I'm learning about Python multiprocessing (from a PMOTW article) and would love some clarification on what exactly the join() method is doing.
An old tutorial from 2008 states that without the p.join() call in the code below, "the child process will sit idle and not terminate, becoming a zombie you must manually kill".
from multiprocessing import Process

def say_hello(name='world'):
    # Parenthesized form runs on both Python 2 and Python 3
    print("Hello, %s" % name)

if __name__ == '__main__':
    p = Process(target=say_hello)
    p.start()
    p.join()
I added a printout of the PID as well as a time.sleep to test, and as far as I can tell, the process terminates on its own:
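from multiprocessing import Process, current_process
import sys
import time

def say_hello(name='world'):
    # Ask for the child's own Process object instead of relying on a global p
    me = current_process()
    print("Hello, %s" % name)
    print("Starting: %s %s" % (me.name, me.pid))
    sys.stdout.flush()
    print("Exiting : %s %s" % (me.name, me.pid))
    sys.stdout.flush()
    time.sleep(20)

if __name__ == '__main__':
    p = Process(target=say_hello)
    p.start()
    # no p.join()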
Within the 20 seconds:
936 ttys000 0:00.05 /Library/Frameworks/Python.framework/Versions/2.7/Reso
938 ttys000 0:00.00 /Library/Frameworks/Python.framework/Versions/2.7/Reso
947 ttys001 0:00.13 -bash
After 20 seconds:
947 ttys001 0:00.13 -bash
The behavior is the same with p.join() added back at the end of the file. Python Module of the Week offers a very readable explanation of the module: "To wait until a process has completed its work and exited, use the join() method." But it seems like at least OS X was doing that anyway.
I'm also wondering about the name of the method. Is the .join() method concatenating anything here? Is it concatenating a process with its end? Or does it just share a name with Python's native .join() method?
Solution
The join() method, when used with threading or multiprocessing, is not related to str.join(): it's not actually concatenating anything together. Rather, it just means "wait for this [thread/process] to complete". The name join is used because the multiprocessing module's API is meant to look similar to the threading module's API, and the threading module uses join for its Thread object. Using the term join to mean "wait for a thread to complete" is common across many programming languages, so Python just adopted it as well.
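To make the "wait" meaning concrete, here is a minimal sketch. The helper names nap and run_and_join are made up for this illustration; only Process, start(), join(), is_alive() and exitcode come from the multiprocessing module itself.

```python
import multiprocessing
import time


def nap(seconds):
    """Stand-in for real work: the child just sleeps."""
    time.sleep(seconds)


def run_and_join(seconds=0.5):
    """Hypothetical helper: start a child, then block in join() until it exits."""
    p = multiprocessing.Process(target=nap, args=(seconds,))
    started = time.monotonic()
    p.start()
    alive_before = p.is_alive()   # child is still running at this point
    p.join()                      # blocks here until the child has exited
    elapsed = time.monotonic() - started
    # After join() returns, the child is gone and exitcode is populated.
    return alive_before, p.is_alive(), p.exitcode, elapsed


if __name__ == "__main__":
    print(run_and_join())
```

Nothing is concatenated anywhere: join() simply does not return until the child has finished, which is why the elapsed time is at least as long as the child's sleep.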
Now, the reason you see the 20-second delay both with and without the call to join() is that, by default, when the main process is ready to exit, it implicitly calls join() on all running multiprocessing.Process instances. This isn't as clearly stated in the multiprocessing docs as it should be, but it is mentioned in the Programming Guidelines section:
Remember also that non-daemonic processes will be joined automatically.
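As a rough approximation of that implicit step, you could write the same wait by hand. The helper join_non_daemonic_children below is hypothetical and is not the code the interpreter actually runs at exit; it only uses multiprocessing.active_children(), which returns the children of the current process that are still alive.

```python
import multiprocessing
import time


def join_non_daemonic_children():
    """Hypothetical helper: wait for every live, non-daemonic child.

    This only approximates what happens implicitly when the main
    process is about to exit; it is not the interpreter's own code.
    """
    joined = []
    for child in multiprocessing.active_children():
        if not child.daemon:
            child.join()          # same blocking wait as an explicit p.join()
            joined.append(child.name)
    return joined


if __name__ == "__main__":
    p = multiprocessing.Process(target=time.sleep, args=(0.5,), name="sleeper")
    p.start()
    # No explicit p.join() on p itself; the helper finds and waits for it.
    print(join_non_daemonic_children())
```

Seen this way, leaving out p.join() in your script doesn't skip the wait; it just moves it to the very end of the main process.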
You can override this behavior by setting the daemon flag on the Process to True prior to starting the process:
p = Process(target=say_hello)
p.daemon = True
p.start()
# Both parent and child will exit here, since the main process has completed.
If you do that, the child process will be terminated as soon as the main process completes:
daemon
The process's daemon flag, a Boolean value. This must be set before start() is called.
The initial value is inherited from the creating process.
When a process exits, it attempts to terminate all of its daemonic child processes.
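Here is a small sketch of those rules in action. The helper start_daemon_sleeper is made up for this example. The AssertionError shown when changing the flag after start() is how CPython currently enforces the "before start()" rule, so treat that detail as an implementation assumption (asserts are skipped under python -O).

```python
import multiprocessing
import time


def start_daemon_sleeper(seconds):
    """Hypothetical helper: start a child that is flagged as a daemon."""
    p = multiprocessing.Process(target=time.sleep, args=(seconds,))
    p.daemon = True               # must be set before start()
    p.start()
    return p


if __name__ == "__main__":
    # The main process is not a daemon, so new children default to False.
    print("main daemon flag:", multiprocessing.current_process().daemon)

    p = start_daemon_sleeper(60)
    print("child daemon flag:", p.daemon, "alive:", p.is_alive())

    try:
        p.daemon = False          # too late, the process has already started
    except AssertionError as exc:
        # Assumption: CPython enforces this with an assert statement.
        print("cannot change daemon after start():", exc)

    # The script ends here. The 60-second child is terminated at exit
    # instead of being waited for, because it is a daemon.
```

Run as a script, this should return to the shell almost immediately rather than after 60 seconds, which is the opposite of the 20-second wait you saw with the non-daemonic process.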