Catching stdout in real time from a subprocess
Question
I want to run rsync.exe on Windows with subprocess.Popen() and print its stdout from Python.
My code works, but it doesn't show any progress until a file transfer is done. I want to print the progress for each file in real time.
I'm using Python 3.1 now, since I heard it should be better at handling IO.
import subprocess, time, os, sys

cmd = "rsync.exe -vaz -P source/ dest/"
p, line = True, 'start'

p = subprocess.Popen(cmd,
                     shell=True,
                     bufsize=64,
                     stdin=subprocess.PIPE,
                     stderr=subprocess.PIPE,
                     stdout=subprocess.PIPE)

for line in p.stdout:
    print(">>> " + str(line.rstrip()))
    p.stdout.flush()
Solution
Some rules of thumb for subprocess:
- Never use shell=True. It needlessly invokes an extra shell process to call your program.
- When calling processes, arguments are passed around as lists. sys.argv in Python is a list, just as argv in C is an array of strings. So you pass a list to Popen to call subprocesses, not a string.
- Don't redirect stderr to a PIPE when you're not reading it. If nobody drains the pipe, it can fill up and block the child.
- Don't redirect stdin when you're not writing to it.
Example:
import subprocess, sys

cmd = ["rsync.exe", "-vaz", "-P", "source/", "dest/"]
p = subprocess.Popen(cmd,
                     stdout=subprocess.PIPE,
                     stderr=subprocess.STDOUT)
# p.stdout yields bytes in Python 3: read until EOF (b'') and decode before printing
for line in iter(p.stdout.readline, b''):
    print(">>> " + line.decode(errors="replace").rstrip())
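One more thing may be worth checking. This is an assumption about this particular symptom, not something confirmed here: progress displays like the one rsync prints with -P are typically redrawn in place with a carriage return rather than a newline. In that case readline() would not see a complete line until the file finishes. The sketch below reads whatever bytes are available and splits on both kinds of line ending. iter_updates is a hypothetical helper name, and it expects a buffered binary stream with a read1() method, which is what p.stdout is by default on current Python 3 versions.

```python
import io


def iter_updates(stream, chunk_size=1024):
    """Yield decoded pieces of output, split on '\\r' or '\\n', as they arrive.

    Empty pieces (for example blank lines) are skipped.
    """
    pending = b""
    while True:
        # read1() returns the bytes that are available (up to chunk_size)
        # instead of waiting for the buffer to fill completely.
        chunk = stream.read1(chunk_size)
        if not chunk:
            break  # EOF
        pending += chunk
        # Treat "\r\n", "\r" and "\n" all as separators.
        parts = pending.replace(b"\r\n", b"\n").replace(b"\r", b"\n").split(b"\n")
        pending = parts.pop()  # last piece may be incomplete, keep it for later
        for part in parts:
            if part:
                yield part.decode(errors="replace")
    if pending:
        yield pending.decode(errors="replace")
```

With the Popen call above it would be used as `for update in iter_updates(p.stdout): print(">>> " + update)`. It does nothing about buffering inside the child process itself.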
That said, rsync probably buffers its output when it detects that it is connected to a pipe instead of a terminal. This is the default behavior: when connected to a pipe, a program must explicitly flush stdout for real-time results; otherwise the standard C library buffers the output in blocks.
To test for that, try running this instead:
cmd = [sys.executable, 'test_out.py']
and create a test_out.py file with the contents:
import sys
import time

print("Hello")
sys.stdout.flush()
time.sleep(10)
print("World")
Executing that subprocess should give you "Hello", then wait 10 seconds before giving "World". If that happens with the Python code above but not with rsync, then rsync itself is buffering its output, so you are out of luck.
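The check described above can also be run as a single script. This is only a minimal sketch: the child program is passed with -c instead of living in a separate test_out.py, the sleep is shortened to 1 second, and timed_lines is a hypothetical helper that records when each line arrives.

```python
import subprocess
import sys
import time

# Same idea as test_out.py, passed inline; the 1-second sleep is an arbitrary choice.
CHILD = (
    "import sys, time; print('Hello'); sys.stdout.flush(); "
    "time.sleep(1); print('World')"
)


def timed_lines(cmd):
    """Run cmd and return a list of (seconds_since_start, text) per output line."""
    start = time.monotonic()
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    result = []
    for line in iter(p.stdout.readline, b""):
        elapsed = time.monotonic() - start
        result.append((elapsed, line.decode(errors="replace").rstrip()))
    p.wait()
    return result


if __name__ == "__main__":
    for elapsed, text in timed_lines([sys.executable, "-c", CHILD]):
        print("%.1fs >>> %s" % (elapsed, text))
```

If the two timestamps are about a second apart, the reading loop is delivering lines as they are written. Any delay seen with rsync would then come from the child side.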
One solution would be to connect directly to a pty, using something like pexpect. Note that pexpect's pty-based spawn is Unix-only, so on Windows this would need a different tool.
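For illustration, here is roughly what the pty approach looks like using the standard library's pty module rather than pexpect. This is a sketch under several assumptions: it runs only on POSIX systems (so not with rsync.exe on Windows), run_in_pty is a hypothetical helper name, and the child is assumed to switch to line-buffered output once it sees a terminal.

```python
import os
import pty
import subprocess
import sys


def run_in_pty(cmd):
    """Run cmd with stdout/stderr attached to a pseudo-terminal.

    Yields decoded chunks of output as they become readable.
    """
    master_fd, slave_fd = pty.openpty()
    proc = subprocess.Popen(cmd, stdout=slave_fd, stderr=slave_fd, close_fds=True)
    os.close(slave_fd)  # the parent only needs the master end
    try:
        while True:
            try:
                data = os.read(master_fd, 1024)
            except OSError:
                # On Linux, reading after the child closes the terminal raises EIO.
                break
            if not data:
                break  # other platforms signal EOF with an empty read
            yield data.decode(errors="replace")
    finally:
        os.close(master_fd)
        proc.wait()


if __name__ == "__main__":
    for chunk in run_in_pty([sys.executable, "-c", "print('Hello from a pty')"]):
        sys.stdout.write(">>> " + chunk)
```

Output read through a pty usually ends lines with "\r\n", and chunks do not necessarily line up with lines, so further splitting may be needed.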