Checking the stdout of a running subprocess in Python

Problem description

I need to periodically check the stdout of a running process. For example, the process is tail -f /tmp/file, which is spawned in the Python script. Then every x seconds, the stdout of that subprocess is written to a string and processed further. The subprocess is eventually stopped by the script.

To parse the stdout of the subprocess, I have used check_output until now, but that doesn't seem to work, as the process is still running and never produces a definite output.

>>> from subprocess import check_output
>>> out = check_output(["tail", "-f", "/tmp/file"])
 #(waiting for tail to finish)

It should be possible to use threads for the subprocesses, so that the output of multiple subprocesses can be processed (e.g. tail -f /tmp/file1, tail -f /tmp/file2).

How can I start a subprocess, periodically check and process its stdout, and eventually stop the subprocess in a multithreading-friendly way? The Python script runs on a Linux system.

The goal is not to continuously read a file; the tail command is only an example, chosen because it behaves exactly like the actual command being used.

edit: I didn't think this through: the file did not exist. check_output now simply waits for the process to finish.

edit2: An alternative approach, using Popen and PIPE, appears to run into the same issue: it waits for tail to finish.

>>> from subprocess import Popen, PIPE, STDOUT
>>> cmd = 'tail -f /tmp/file'
>>> p = Popen(cmd, shell=True, stdin=PIPE, stdout=PIPE, stderr=STDOUT, close_fds=True)
>>> output = p.stdout.read()
 #(waiting for tail to finish)


Solution

Your second attempt is 90% correct. The only issue is that you are trying to read all of tail's stdout at once, after it has finished. However, tail is intended to run (indefinitely?) in the background, so you really want to read its stdout line by line:

from subprocess import Popen, PIPE, STDOUT
p = Popen(["tail", "-f", "/tmp/file"], stdin=PIPE, stdout=PIPE, stderr=STDOUT)
for line in p.stdout:
    print(line)

I have removed the shell=True and close_fds=True arguments. The first is unnecessary and potentially dangerous, while the second is just the default.

Remember that file objects are iterable over their lines in Python. The for loop will run until tail dies, but it processes each line as it appears, unlike read, which blocks until tail dies.

If I create an empty file at /tmp/file, start this program, and begin echoing lines into the file from another shell, the program echoes those lines. You should probably replace print with something a bit more useful.
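For instance (a hypothetical sketch, not part of the original answer), a small helper could decode each raw bytes line and strip the trailing newline before handing it to further processing; the name process is made up here:

```python
def process(line):
    # each line read from p.stdout is a bytes object ending in b'\n';
    # decode it and drop the trailing newline before further processing
    return line.decode("utf-8").rstrip("\n")

# inside the loop, e.g.: print(process(line)), or append it to a list
```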

Here is an example of the commands I typed after starting the code above:

Command line

$ echo a > /tmp/file
$ echo b > /tmp/file
$ echo c >> /tmp/file

Program output (from Python, in a different shell)

b'a\n'
b'tail: /tmp/file: file truncated\n'
b'b\n'
b'c\n'

If you want your main program to remain responsive while you react to the output of tail, start the loop in a separate thread. You should make this thread a daemon so that it does not prevent your program from exiting even if tail never finishes. You can have the thread open the subprocess, or you can just pass the standard output in to it. I prefer the latter approach, since it gives you more control in the main thread:

from subprocess import Popen, PIPE, STDOUT
from threading import Thread

def deal_with_stdout(stream):
    # reads lines as they appear, exactly like the loop above
    for line in stream:
        print(line)

p = Popen(["tail", "-f", "/tmp/file"], stdin=PIPE, stdout=PIPE, stderr=STDOUT)
t = Thread(target=deal_with_stdout, args=(p.stdout,), daemon=True)
t.start()
t.join()

The code here is nearly identical, with the addition of a new thread. I added a join() at the end so the program behaves well as an example (join waits for the thread to die before returning). You will probably want to replace that with whatever processing code you would normally run.
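As a sketch of what that might look like (not from the original answer: the sleep is a stand-in for the main program's real work, and terminate/wait is one common way to stop the subprocess cleanly):

```python
import time
from subprocess import Popen, PIPE, STDOUT
from threading import Thread

def deal_with_stdout(stream):
    for line in stream:
        print(line)

p = Popen(["tail", "-f", "/tmp/file"], stdin=PIPE, stdout=PIPE, stderr=STDOUT)
t = Thread(target=deal_with_stdout, args=(p.stdout,), daemon=True)
t.start()

time.sleep(2)      # stand-in for the main program's actual work
p.terminate()      # ask tail to exit (SIGTERM); the reader loop ends at EOF
p.wait()           # reap the child so it doesn't linger as a zombie
t.join(timeout=1)  # give the reader thread a moment to finish
```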

If your thread is complex enough, you may also want to inherit from Thread and override the run method instead of passing in a simple target.
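A minimal sketch of that approach (TailReader is a hypothetical name; it collects decoded lines on the instance instead of printing them):

```python
from subprocess import Popen, PIPE, STDOUT
from threading import Thread

class TailReader(Thread):
    """Daemon thread that reads lines from a byte stream as they appear."""

    def __init__(self, stream):
        super().__init__(daemon=True)
        self.stream = stream
        self.lines = []

    def run(self):
        # same loop as before, but the collected lines stay on the instance
        for line in self.stream:
            self.lines.append(line.decode().rstrip("\n"))

p = Popen(["tail", "-f", "/tmp/file"], stdin=PIPE, stdout=PIPE, stderr=STDOUT)
t = TailReader(p.stdout)
t.start()
```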
