CompletedProcess from subprocess.run() doesn't return a string
Problem description
According to the Python 3.5 docs, subprocess.run() returns a CompletedProcess object with a stdout member that contains "A bytes sequence, or a string if run() was called with universal_newlines=True." I'm only seeing a byte sequence and not a string, which I was assuming (hoping) would be equivalent to a text line. For example,
import pprint
import subprocess

my_data = ""
line_count = 0

proc = subprocess.run(
    args=['cat', 'input.txt'],
    universal_newlines=True,
    stdout=subprocess.PIPE)

for text_line in proc.stdout:
    my_data += text_line
    line_count += 1

word_file = open('output.txt', 'w')
pprint.pprint(my_data, word_file)
pprint.pprint(line_count, word_file)
Note: this uses subprocess.run(), a new feature in Python 3.5 that won't run in previous versions.
Do I need to create my own line buffering logic, or is there a way to get Python to do that for me?
Solution
proc.stdout is already a string in your case; run print(type(proc.stdout)) to make sure. It contains all of the subprocess's output: subprocess.run() does not return until the child process is dead.
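The type check can be tried without cat or input.txt. The following is a minimal sketch that assumes it is acceptable to spawn the current Python interpreter (through sys.executable) as a stand-in child process; the names cmd, text_proc and bytes_proc are made up for the illustration.

```python
import subprocess
import sys

# Stand-in child process (an assumption for this sketch, not the question's
# `cat input.txt`): the current interpreter printing two lines.
cmd = [sys.executable, "-c", "print('first'); print('second')"]

# With universal_newlines=True the captured output is decoded to str.
text_proc = subprocess.run(cmd, stdout=subprocess.PIPE, universal_newlines=True)

# Without it, the captured output stays as raw bytes.
bytes_proc = subprocess.run(cmd, stdout=subprocess.PIPE)

print(type(text_proc.stdout))   # <class 'str'>
print(type(bytes_proc.stdout))  # <class 'bytes'>
```

In both cases run() has already waited for the child, so stdout holds the complete output rather than a stream.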
for text_line in proc.stdout: is incorrect: for char in text_string enumerates characters (Unicode code points) in Python, not lines. To get lines, call:
lines = proc.stdout.splitlines()
The result may be different from .split('\n') if there are Unicode newlines in the string.
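The three behaviours mentioned above can be compared on a small string. This is only an illustrative sketch: the sample text (which contains U+2028, the Unicode LINE SEPARATOR) and the variable names are made up for the example.

```python
# Sample text: a regular newline, a Unicode LINE SEPARATOR (U+2028),
# and a trailing newline.
text = "one\ntwo\u2028three\n"

# Iterating over a str yields single characters, not lines.
chars = [c for c in text]

# splitlines() breaks on "\n" and on other line boundaries such as U+2028,
# and does not produce a trailing empty string.
lines = text.splitlines()   # ['one', 'two', 'three']

# split("\n") breaks only on "\n" and keeps the empty piece after the
# final newline.
parts = text.split("\n")    # ['one', 'two\u2028three', '']

print(len(chars), len(lines), len(parts))
```

This is why the loop in the question counted characters instead of lines.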
If you want to read the output line by line (to avoid running out of memory for long-running processes):
from subprocess import Popen, PIPE

with Popen(command, stdout=PIPE, universal_newlines=True) as process:
    for line in process.stdout:
        do_something_with(line)
Note: process.stdout is a file-like object in this case. Popen() does not wait for the process to finish; it returns immediately, as soon as the child process is started. process is a subprocess.Popen instance here, not a CompletedProcess.
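A runnable version of the streaming loop might look like the sketch below. The child command is a hypothetical stand-in for command (the current interpreter printing three lines), and printing each line stands in for do_something_with(); neither comes from the original answer.

```python
import sys
from subprocess import Popen, PIPE

# Hypothetical child process used in place of `command`.
command = [sys.executable, "-c", "for i in range(3): print('line', i)"]

line_count = 0
with Popen(command, stdout=PIPE, universal_newlines=True) as process:
    # process.stdout is a text-mode file object; iterating it yields one
    # str per line (newline included) as the child produces output.
    for line in process.stdout:
        line_count += 1
        print(line.rstrip("\n"))

# Leaving the with block closes the pipe and waits for the child,
# so returncode is set at this point.
print(line_count, process.returncode)
```

Only one line is held in memory at a time, which is the point of using Popen() instead of run() for large outputs.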
If all you need is to count the number of lines (terminated by b'\n') in the output, like wc -l:
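from functools import partial
from subprocess import Popen, PIPE

with Popen(command, stdout=PIPE) as process:
    # Read the pipe in 8 KiB chunks until read() returns b'' (end of file).
    read_chunk = partial(process.stdout.read, 1 << 13)
    line_count = sum(chunk.count(b'\n') for chunk in iter(read_chunk, b''))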
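The chunked count can be wrapped in a small helper to check that the result does not depend on where the chunk boundaries fall. This is a sketch under assumptions: count_newlines and its chunk_size parameter are hypothetical names, and the child process is again the current interpreter rather than a real command.

```python
import sys
from functools import partial
from subprocess import Popen, PIPE


def count_newlines(command, chunk_size=1 << 13):
    """Count b'\\n' bytes in the child's stdout, reading chunk_size bytes at a time."""
    with Popen(command, stdout=PIPE) as process:
        read_chunk = partial(process.stdout.read, chunk_size)
        # iter(callable, sentinel) keeps calling read_chunk() until it
        # returns b'', which read() does at end of file.
        return sum(chunk.count(b"\n") for chunk in iter(read_chunk, b""))


# Hypothetical child process that prints three lines.
command = [sys.executable, "-c", "print('a'); print('b'); print('c')"]

print(count_newlines(command))                # default 8 KiB chunks
print(count_newlines(command, chunk_size=1))  # one byte per read, same count
```

Because the newline is a single byte, counting per chunk and summing gives the same total for any chunk size.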
See also: Why is reading lines from stdin much slower in C++ than Python?