Indefinite daemonized process spawning in Python
Problem Description
I'm trying to build a Python daemon that launches other fully independent processes.
The general idea is for a given shell command, poll every few seconds and ensure that exactly k instances of the command are running. We keep a directory of pidfiles, and when we poll we remove pidfiles whose pids are no longer running and start up (and make pidfiles for) however many processes we need to get to k of them.
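For concreteness, that poll step might look something like the following sketch (pid_running and poll_pidfiles are illustrative names, not part of my actual code; spawn is the function shown further down):

import os

def pid_running(pid):
    # signal 0 does error checking only: it reports whether the pid
    # exists without actually delivering a signal
    try:
        os.kill(pid, 0)
        return True
    except OSError:
        return False

def poll_pidfiles(pid_dir, cmd, child_cwd, k):
    # drop pidfiles for processes that have exited
    live = 0
    for name in os.listdir(pid_dir):
        if not name.endswith('.pid'):
            continue
        pid = int(name[:-len('.pid')])
        if pid_running(pid):
            live += 1
        else:
            os.remove(os.path.join(pid_dir, name))
    # top back up to k running instances
    for _ in xrange(k - live):
        spawn(cmd, child_cwd)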
The child processes also need to be fully independent, so that if the parent process dies the children won't be killed. From what I've read, it seems there is no way to do this with the subprocess module. To this end, I used the snippet mentioned here:
http://code.activestate.com/recipes/66012-fork-a-daemon-process-on-unix/
I made a couple necessary modifications (you'll see the lines commented out in the attached snippet):
- The original parent process can't exit because we need the launcher daemon to persist indefinitely.
- The child processes need to start with the same cwd as the parent.
Here's my spawn fn and a test:
import os
import sys
import subprocess
import time

def spawn(cmd, child_cwd):
    """
    do the UNIX double-fork magic, see Stevens' "Advanced
    Programming in the UNIX Environment" for details (ISBN 0201563177)
    http://www.erlenstar.demon.co.uk/unix/faq_2.html#SEC16
    """
    try:
        pid = os.fork()
        if pid > 0:
            # exit first parent
            #sys.exit(0) # parent daemon needs to stay alive to launch more in the future
            return
    except OSError, e:
        sys.stderr.write("fork #1 failed: %d (%s)\n" % (e.errno, e.strerror))
        sys.exit(1)

    # decouple from parent environment
    #os.chdir("/") # we want the child processes to keep the parent's cwd
    os.setsid()
    os.umask(0)

    # do second fork
    try:
        pid = os.fork()
        if pid > 0:
            # exit from second parent
            sys.exit(0)
    except OSError, e:
        sys.stderr.write("fork #2 failed: %d (%s)\n" % (e.errno, e.strerror))
        sys.exit(1)

    # redirect standard file descriptors
    sys.stdout.flush()
    sys.stderr.flush()
    si = file('/dev/null', 'r')
    so = file('/dev/null', 'a+')
    se = file('/dev/null', 'a+', 0)
    os.dup2(si.fileno(), sys.stdin.fileno())
    os.dup2(so.fileno(), sys.stdout.fileno())
    os.dup2(se.fileno(), sys.stderr.fileno())

    pid = subprocess.Popen(cmd, cwd=child_cwd, shell=True).pid

    # write pidfile
    with open('pids/%s.pid' % pid, 'w') as f:
        f.write(str(pid))
    sys.exit(1)

def mkdir_if_none(path):
    if not os.access(path, os.R_OK):
        os.mkdir(path)

if __name__ == '__main__':
    try:
        cmd = sys.argv[1]
        num = int(sys.argv[2])
    except:
        print 'Usage: %s <cmd> <num procs>' % __file__
        sys.exit(1)
    mkdir_if_none('pids')
    mkdir_if_none('test_cwd')
    for i in xrange(num):
        print 'spawning %d...' % i
        spawn(cmd, 'test_cwd')
        time.sleep(0.01) # give the system some breathing room
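For reference, the test can be driven like this (assuming the script is saved as spawner.py; the command and count are arbitrary examples):

python spawner.py 'sleep 1' 5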
In this situation, things seem to work fine, and the child processes persist even when the parent is killed. However, I'm still running into a spawn limit on the original parent. After ~650 spawns (not concurrently, the children have finished) the parent process chokes with the error:
spawning 650...
fork #2 failed: 35 (Resource temporarily unavailable)
Is there any way to rewrite my spawn function so that I can spawn these independent child processes indefinitely? Thanks!
Solution
Thanks to your list of processes I'm willing to say that this is because you have hit one of a number of fundamental limitations:
- rlimit nproc: the maximum number of processes a given user is allowed to execute -- see setrlimit(2), the bash(1) ulimit built-in, and /etc/security/limits.conf for details on per-user process limits.
- rlimit nofile: the maximum number of file descriptors a given process is allowed to have open at once. (Each new process probably creates three new pipes in the parent, for the child's stdin, stdout, and stderr descriptors.) Both rlimits can be read from Python; see the sketch after this list.
- System-wide maximum number of processes; see /proc/sys/kernel/pid_max.
- System-wide maximum number of open files; see /proc/sys/fs/file-max.
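If it helps to confirm which of these you're hitting, here is a sketch for reading the first two limits from inside Python with the standard resource module (Unix-only; not part of the original program):

import resource

# each call returns a (soft, hard) pair; resource.RLIM_INFINITY
# means the limit is unlimited
print resource.getrlimit(resource.RLIMIT_NPROC)   # max processes for this user
print resource.getrlimit(resource.RLIMIT_NOFILE)  # max open fds for this process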
Because you're not reaping your dead children, many of these resources are held open longer than they should be. Your second children are being properly handled by init(8) -- their parent is dead, so they are re-parented to init(8), and init(8) will clean up after them (wait(2)) when they die.
However, your program is responsible for cleaning up after the first set of children. C programs typically install a signal(7) handler for SIGCHLD that calls wait(2) or waitpid(2) to reap the children's exit status and thus remove their entries from the kernel's memory.
But signal handling in a script is a bit annoying. If you can set the SIGCHLD signal disposition to SIG_IGN explicitly, the kernel will know that you are not interested in the exit status and will reap the children for you.
Try adding:
import signal
signal.signal(signal.SIGCHLD, signal.SIG_IGN)
near the top of your program.
Note that I don't know what this does for subprocess. It might not be pleased. If that is the case, then you'll need to install a signal handler to call wait(2) for you.
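A minimal version of such a handler might look like this (a sketch; reap_children is an illustrative name):

import os
import signal

def reap_children(signum, frame):
    # collect every child that has exited, without blocking, so the
    # kernel can drop their process-table entries
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except OSError:   # ECHILD: no children left to wait for
            return
        if pid == 0:      # children exist but none have exited yet
            return

signal.signal(signal.SIGCHLD, reap_children)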