CompletedProcess from subprocess.run() doesn't return a string

2022-01-18 00:00:00 python subprocess python-3.5

Problem description

According to the Python 3.5 docs, subprocess.run() returns a CompletedProcess object with a stdout member that contains "A bytes sequence, or a string if run() was called with universal_newlines=True." I'm only seeing a byte sequence and not a string, which I was assuming (hoping) would be equivalent to a text line. For example,

    import pprint
    import subprocess

    my_data = ""
    line_count = 0

    proc = subprocess.run(
        args=['cat', 'input.txt'],
        universal_newlines=True,
        stdout=subprocess.PIPE)

    for text_line in proc.stdout:
        my_data += text_line
        line_count += 1

    word_file = open('output.txt', 'w')
    pprint.pprint(my_data, word_file)
    pprint.pprint(line_count, word_file)

Note: this uses a new feature in Python 3.5 that won't run in previous versions.

Do I need to create my own line buffering logic, or is there a way to get Python to do that for me?

Solution

proc.stdout is already a string in your case; run print(type(proc.stdout)) to make sure. It contains all of the subprocess's output: subprocess.run() does not return until the child process has exited.

for text_line in proc.stdout: is incorrect: for char in text_string enumerates characters (Unicode code points) in Python, not lines. To get lines, call:

    lines = proc.stdout.splitlines()

The result may be different from .split('\n') if there are Unicode newlines in the string.

If you want to read the output line by line (to avoid running out of memory for long-running processes):

    from subprocess import Popen, PIPE

    # command is the argument list to run, e.g. ['cat', 'input.txt']
    with Popen(command, stdout=PIPE, universal_newlines=True) as process:
        for line in process.stdout:
            do_something_with(line)

Note: process.stdout is a file-like object in this case. Popen() does not wait for the process to finish; it returns immediately, as soon as the child process is started. Here process is a subprocess.Popen instance, not a CompletedProcess.

If all you need is to count the number of lines (terminated by b'\n') in the output, like wc -l:

    from functools import partial
    from subprocess import Popen, PIPE

    with Popen(command, stdout=PIPE) as process:
        read_chunk = partial(process.stdout.read, 1 << 13)  # read 8 KiB at a time
        line_count = sum(chunk.count(b'\n') for chunk in iter(read_chunk, b''))

See Why is reading lines from stdin much slower in C++ than Python?
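
To make the points above concrete, here is a minimal, self-contained sketch. It is not part of the original answer. The child command is a hypothetical stand-in for cat input.txt (a Python one-liner that prints three lines, so no input file is needed), and count_newlines is a helper name introduced only for illustration.

```python
import subprocess
import sys
from functools import partial

# Hypothetical child command used only for this sketch: it prints three
# lines, so the example does not depend on input.txt existing.
command = [sys.executable, "-c", "print('one'); print('two'); print('three')"]

proc = subprocess.run(command, stdout=subprocess.PIPE, universal_newlines=True)

# With universal_newlines=True, stdout is one str holding all the output.
stdout_is_str = isinstance(proc.stdout, str)

# Iterating over a str yields one-character strings, not lines.
chars = list(proc.stdout)

# splitlines() yields the lines, without their terminators.
lines = proc.stdout.splitlines()

# splitlines() also splits on Unicode line boundaries such as U+2028,
# which split('\n') leaves untouched.
text = "a\u2028b\n"
unicode_aware = text.splitlines()
newline_only = text.split("\n")


def count_newlines(cmd):
    """Count b'\\n' bytes in the child's output, reading 8 KiB chunks."""
    with subprocess.Popen(cmd, stdout=subprocess.PIPE) as process:
        read_chunk = partial(process.stdout.read, 1 << 13)
        return sum(chunk.count(b"\n") for chunk in iter(read_chunk, b""))


print(stdout_is_str, len(chars), lines, count_newlines(command))
```

The character count comes out larger than the line count, which is why line_count in the question's loop ends up counting characters rather than lines.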


	