Python: non-blocking + non-defunct process

2022-01-18 00:00:00 python subprocess

Problem description

I would like to create a parent process that creates many child processes. Since the parent is only responsible for creating the children, it does not care about their status.

Since subprocess.call is blocking, it doesn't work here, so I use subprocess.Popen instead. However, Popen leaves behind a zombie (defunct) process once the child terminates (Link).

Is there a way to solve this problem?

Thanks in advance.


Solution

There are a lot of ways to deal with this. The key point is that zombie ("defunct") processes exist so that the parent process can collect their exit statuses.

  1. As the creator of the process, you can announce your intent to ignore the status. The POSIX method is to set the SA_NOCLDWAIT flag (using sigaction). This is a bit of a pain to do in Python, but most Unix-like systems allow you to simply ignore SIGCHLD / SIGCLD (the spelling varies from one Unix-like system to another), which is easy to do in Python:

import signal

# Ignore SIGCHLD so that terminated children are reaped automatically.
signal.signal(signal.SIGCHLD, signal.SIG_IGN)
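As a fuller sketch of this first approach, the snippet below ignores SIGCHLD, starts some children with subprocess.Popen, and never waits for them. The names spawn_without_zombies and pid_exists are hypothetical helpers, not part of any library, and the behavior assumes a Unix-like system where ignoring SIGCHLD makes the kernel reap children automatically (Linux does this).

```python
import os
import signal
import subprocess


def spawn_without_zombies(commands):
    """Ignore SIGCHLD, then start every command without waiting for it."""
    # Process-wide setting: exit statuses of all children are discarded.
    signal.signal(signal.SIGCHLD, signal.SIG_IGN)
    # Keep the Popen objects around; we simply never call wait() on them.
    return [subprocess.Popen(cmd) for cmd in commands]


def pid_exists(pid):
    """Return True if pid is still in the process table (zombies count)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence
    except ProcessLookupError:
        return False
    return True
```

Once a child started this way has exited, pid_exists(child.pid) should become False without any call to wait() or poll(). Because the signal disposition is process-wide, it also affects any other code in the same process that waits for children.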

  2. Or, if this is not available for some reason or does not work on your system, you can use an old stand-by trick: don't just fork once, fork twice. In the first child, fork a second child; in the second child, use execve (or similar) to run the desired program; then, in the first child, exit (with _exit). In the original parent, use wait or waitpid or whatever the OS provides to collect the status of the first child.

The reason this works is that the second child has now become an "orphan": its parent, the first child, has died and been collected by your original process. As an orphan it is handed over to a proxy parent (specifically, to "init"), which is always waiting and hence collects all its zombies right away.
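The double fork can be sketched in Python with the os module roughly as follows. spawn_detached is a hypothetical name, the exit codes 1 and 127 are my own choices, and the code assumes a Unix-like system where os.fork is available.

```python
import os


def spawn_detached(argv):
    """Run argv as a grandchild; return the first child's wait status."""
    pid = os.fork()
    if pid == 0:
        # First child: fork the real worker, then exit immediately.
        exit_code = 1  # reported if the second fork fails
        try:
            if os.fork() == 0:
                # Second child: replace ourselves with the target program.
                try:
                    os.execvp(argv[0], argv)
                finally:
                    # Only reached if exec failed; never return to the caller.
                    os._exit(127)
            exit_code = 0
        finally:
            # _exit skips cleanup handlers inherited from the parent.
            os._exit(exit_code)
    # Original parent: collect the short-lived first child right away.
    _, status = os.waitpid(pid, 0)
    return status
```

The parent only ever waits for the first child, which exits almost immediately, so the call returns quickly even if the program in argv runs for a long time.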

  3. In addition to the double fork, you can make your sub-processes live in their own separate session and/or give up controlling terminal access ("daemonize", in Unix-y terms). (This is a bit messy and OS-dependent; I've coded it before, but for some corporate code I don't have access to now.)
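One way to get the separate-session part in Python 3 is the start_new_session argument of subprocess.Popen, which makes the child call setsid() before it executes the program. This is only a sketch: spawn_in_new_session is a hypothetical helper, and redirecting the standard streams to the null device is my assumption about what giving up the terminal should mean here. A new session by itself does not prevent zombies, so it is meant to be combined with one of the other methods.

```python
import subprocess


def spawn_in_new_session(argv):
    """Start argv in its own session with no terminal-backed streams."""
    return subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # the child calls setsid() before exec
    )
```

A child started this way becomes the leader of a new session, so its session ID equals its own PID and differs from the parent's.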

  4. Finally, you could simply collect those processes periodically. If you're using the subprocess module, simply call the .poll() method on each process whenever it seems convenient. This returns None if the process is still running, and the exit status (having collected it) if it has finished. If some are still running, your main program can exit anyway while they keep running; at that point they become orphaned, as in method #2 above.
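The periodic collection can be sketched as a small helper that the main loop calls whenever convenient. reap_finished is a hypothetical name; the only library calls involved are subprocess.Popen and its poll() method.

```python
import subprocess


def reap_finished(running):
    """Poll each Popen in `running`; return (still_running, finished)."""
    still_running, finished = [], []
    for proc in running:
        # poll() returns None while the child runs. Otherwise it collects
        # the exit status, which is what clears the zombie entry.
        if proc.poll() is None:
            still_running.append(proc)
        else:
            finished.append(proc)
    return still_running, finished
```

A typical use would be `running, done = reap_finished(running)` once per iteration of the parent's main loop, with each finished process exposing its exit status as proc.returncode.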

The "ignore SIGCHLD" method is simple and easy, but it has the drawback of interfering with library routines that create and wait for sub-processes. There's a work-around in Python 2.7 and later (http://bugs.python.org/issue15756), but it means the library routines can't see any failures in those sub-processes.
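To illustrate the drawback, here is a sketch of what I would expect on Linux with CPython 3: with SIGCHLD ignored, subprocess.call cannot retrieve the child's status, so it reports 0 even though the child exited with a failure code. call_with_sigchld_ignored and FAILING_CHILD are hypothetical names, and restoring SIG_DFL afterwards assumes the process started with the default disposition.

```python
import signal
import subprocess
import sys

# A child that deliberately fails with exit status 3.
FAILING_CHILD = [sys.executable, "-c", "import sys; sys.exit(3)"]


def call_with_sigchld_ignored(argv):
    """Run argv via subprocess.call while SIGCHLD is ignored."""
    signal.signal(signal.SIGCHLD, signal.SIG_IGN)
    try:
        return subprocess.call(argv)
    finally:
        # Assumption: the disposition was the default before this call.
        signal.signal(signal.SIGCHLD, signal.SIG_DFL)
```

Under those assumptions, subprocess.call(FAILING_CHILD) returns 3 with the default disposition, while call_with_sigchld_ignored(FAILING_CHILD) returns 0, so the failure goes unnoticed.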
