Python subprocesses in parallel
Question
I want to run many processes in parallel and be able to read their stdout at any time. How should I do it? Do I need to run a thread for each subprocess.Popen() call, or what?
Answer
You can do it in a single thread.
Suppose you have a script that prints lines at random times:
#!/usr/bin/env python
# file: child.py
import os
import random
import sys
import time

for i in range(10):
    print("%2d %s %s" % (int(sys.argv[1]), os.getpid(), i))
    sys.stdout.flush()
    time.sleep(random.random())
If you'd like to collect the output as soon as it becomes available, you could use select on POSIX systems, as @zigg suggested:
#!/usr/bin/env python
from __future__ import print_function
from select import select
from subprocess import Popen, PIPE

# start several subprocesses
processes = [Popen(['./child.py', str(i)], stdout=PIPE,
                   bufsize=1, close_fds=True,
                   universal_newlines=True)
             for i in range(5)]

# read output
timeout = 0.1  # seconds
while processes:
    # remove finished processes from the list (O(N**2))
    for p in processes[:]:
        if p.poll() is not None:  # process ended
            print(p.stdout.read(), end='')  # read the rest
            p.stdout.close()
            processes.remove(p)

    # wait until there is something to read
    rlist = select([p.stdout for p in processes], [], [], timeout)[0]

    # read a line from each process that has output ready
    for f in rlist:
        print(f.readline(), end='')  # NOTE: it can block
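The readline() call above can block if a child has written only part of a line. One way around that, sketched here under a few assumptions, is to read raw bytes with os.read() on the pipe's file descriptor, which returns whatever is currently available. This is a POSIX-only, Python 3 sketch; CHILD_CODE is a hypothetical stand-in for child.py and collect_chunks is an illustrative name, not part of the original answer.

```python
import os
import sys
from select import select
from subprocess import Popen, PIPE

# Hypothetical stand-in for child.py so the sketch is self-contained.
CHILD_CODE = (
    "import sys, time\n"
    "for i in range(3):\n"
    "    print(sys.argv[1], i)\n"
    "    sys.stdout.flush()\n"
    "    time.sleep(0.01)\n"
)


def collect_chunks(n_children=3, timeout=0.1):
    """Run n_children children and return their output as {index: text}."""
    procs = [Popen([sys.executable, "-c", CHILD_CODE, str(i)], stdout=PIPE)
             for i in range(n_children)]
    # map each raw file descriptor to the index of the child that owns it
    open_fds = {p.stdout.fileno(): i for i, p in enumerate(procs)}
    chunks = {i: b"" for i in range(n_children)}
    while open_fds:
        ready = select(list(open_fds), [], [], timeout)[0]
        for fd in ready:
            data = os.read(fd, 4096)  # returns at once with what is there
            if data:
                chunks[open_fds[fd]] += data
            else:  # an empty read means EOF: the child closed its stdout
                del open_fds[fd]
    for p in procs:
        p.stdout.close()
        p.wait()
    return {i: text.decode() for i, text in chunks.items()}
```

Calling collect_chunks(2) returns a dict with one string per child. The trade-off is that the parent receives arbitrary chunks rather than whole lines, so any line splitting has to be done on the accumulated text.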
A more portable solution (one that should work on Windows, Linux, and OS X) can use a reader thread for each process; see "Non-blocking read on a subprocess.PIPE in python".
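A minimal sketch of the reader-thread approach might look like the following. It assumes Python 3 (for the queue module and the daemon= argument); CHILD_CODE is a hypothetical stand-in for child.py, and reader and run_with_threads are illustrative names rather than code from the linked question.

```python
import sys
from queue import Queue
from subprocess import Popen, PIPE
from threading import Thread

# Hypothetical stand-in for child.py so the sketch is self-contained.
CHILD_CODE = (
    "import sys, time\n"
    "for i in range(3):\n"
    "    print(sys.argv[1], i)\n"
    "    sys.stdout.flush()\n"
    "    time.sleep(0.01)\n"
)


def reader(index, pipe, queue):
    """Forward every line from pipe to queue, then signal EOF with None."""
    with pipe:
        for line in pipe:
            queue.put((index, line))
    queue.put((index, None))


def run_with_threads(n_children=3):
    """Run n_children children; return (index, line) pairs in arrival order."""
    queue = Queue()
    procs = [Popen([sys.executable, "-c", CHILD_CODE, str(i)], stdout=PIPE,
                   universal_newlines=True)
             for i in range(n_children)]
    for i, p in enumerate(procs):
        Thread(target=reader, args=(i, p.stdout, queue), daemon=True).start()
    results = []
    still_running = n_children
    while still_running:
        index, line = queue.get()  # only the main thread waits here
        if line is None:  # one reader reached EOF
            still_running -= 1
        else:
            results.append((index, line))
    for p in procs:
        p.wait()
    return results
```

Each reader thread blocks on its own pipe, so a slow child never stalls the others, and the main thread sees all lines through a single queue.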
Here's an os.pipe()-based solution that works on Unix and Windows:
#!/usr/bin/env python
from __future__ import print_function
import io
import os
import sys
from subprocess import Popen

ON_POSIX = 'posix' in sys.builtin_module_names

# create a pipe to get data
input_fd, output_fd = os.pipe()

# start several subprocesses
processes = [Popen([sys.executable, 'child.py', str(i)], stdout=output_fd,
                   close_fds=ON_POSIX)  # close input_fd in children
             for i in range(5)]
os.close(output_fd)  # close unused end of the pipe

# read output line by line as soon as it is available
with io.open(input_fd, 'r', buffering=1) as file:
    for line in file:
        print(line, end='')

# wait for all children to exit
for p in processes:
    p.wait()
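Because all children share one pipe here, the parent sees a single interleaved stream. Since child.py prefixes every line with the index it was given, the lines can be regrouped afterwards. The helper below is a small illustrative sketch (group_by_child is a hypothetical name, and the PIDs in the sample are made up); it assumes each line has exactly the three fields that child.py prints.

```python
from collections import defaultdict


def group_by_child(lines):
    """Group lines of the form '<index> <pid> <counter>' by child index."""
    groups = defaultdict(list)
    for line in lines:
        index, pid, counter = line.split()
        groups[int(index)].append(int(counter))
    return dict(groups)


# Sample lines in the child.py format; the PIDs are invented for illustration.
sample = [" 0 4242 0\n", " 1 4243 0\n", " 0 4242 1\n"]
print(group_by_child(sample))  # {0: [0, 1], 1: [0]}
```

On POSIX, writes of at most PIPE_BUF bytes to a pipe are atomic, so short, flushed lines like these should arrive whole rather than spliced with another child's output.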