Python3: a subprocess sleeps when started by Popen

Question

I need to run two commands in parallel. Each takes a long time to run (about 1 minute for one and almost 2 minutes for the other), and both write a lot of data to stdout and stderr (about 300 kB on stderr and several MB on stdout). I have to capture both streams.

I used to run them with subprocess.run(), but that executes them one after the other. Since each command is single-threaded, I thought of running them in parallel with Popen.

Unfortunately, the simple way doesn't work:

import shlex
import subprocess
import sys

class test:
    def __init__(self, param):
        cmd = "... %d" % param  # the command line, parametrized with param
        self.__p = subprocess.Popen(shlex.split(cmd), shell=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)

    def waitTerm(self, t=None):
        self.__p.wait(t)
        if self.__p.returncode != 0:
            print(self.__p.stderr.read(), file=sys.stderr)
            raise Exception('failure')
        self.o = self.__p.stdout.read()

t1 = test(1)
t8 = test(8)

t1.waitTerm()  # t1 is expected to take longer
t8.waitTerm()

# here I can use the stdout of both processes
print(t1.o)  # just an example

Both processes end up in the sleeping state and never finish. I believe this happens because the buffers behind the pipes fill up.

What is the smartest thing to do in this case?

Answer 1

Score: 1

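Threading and draining output as it is produced

You can start both subprocesses and use one thread per subprocess to read its output while it is being produced.

Because the pipes are drained continuously, they never fill up and the subprocesses never block on a write. subprocess.Popen.communicate() does this for you: it reads stdout and stderr until end-of-file, waits for the process to exit, and returns both streams as strings.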
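
To see why waiting before reading stalls, the following is a minimal sketch that is not taken from the original post. It assumes a hypothetical stand-in child (a Python one-liner run through sys.executable) that writes a given number of bytes to stdout; the names CHILD and wait_then_read are made up for illustration. The parent first calls wait() with a timeout, as in the question, and only then drains the pipes with communicate().

```python
import subprocess
import sys

# Hypothetical stand-in for a chatty command: writes argv[1] bytes to stdout.
CHILD = "import sys; sys.stdout.write('x' * int(sys.argv[1]))"

def wait_then_read(size, timeout=3.0):
    """Return (blocked, captured_length) for a child that writes `size` bytes."""
    p = subprocess.Popen([sys.executable, "-c", CHILD, str(size)],
                         stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
    try:
        p.wait(timeout)          # nobody reads the pipe while we wait
        blocked = False
    except subprocess.TimeoutExpired:
        blocked = True           # child is still alive, stuck writing to a full pipe
    out, _err = p.communicate()  # drains stdout and stderr, then reaps the child
    return blocked, len(out)

if __name__ == "__main__":
    print(wait_then_read(100))
    print(wait_then_read(5_000_000))
```

With a small output the child should exit before the timeout. With several MB it should still be blocked in its write when the timeout expires, and communicate() then lets it finish. The exact threshold depends on the operating system's pipe buffer size.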

Here is how that looks for the two commands in the question:

import threading
import subprocess
import shlex

class Test:
    def __init__(self, param):
        cmd = "... %d" % param  # parametrized with param
        self.process = subprocess.Popen(shlex.split(cmd), shell=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
        self.thread = None

    def reader_thread(self):
        # communicate() drains both pipes until EOF, then waits for the process
        self.stdout, self.stderr = self.process.communicate()

    def run(self):
        # Keep a reference to the thread so that join() can wait for it
        self.thread = threading.Thread(target=self.reader_thread)
        self.thread.start()

    def join(self, timeout=None):
        self.thread.join(timeout)

t1 = Test(1)
t8 = Test(8)

t1.run()
t8.run()

# Wait for both reader threads to finish before using t1.stdout and t8.stdout
t1.join()
t8.join()
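
As an alternative sketch, assuming no output is needed until each command has finished, subprocess.run() itself can be moved into worker threads with concurrent.futures.ThreadPoolExecutor. run() with capture_output=True drains both pipes internally. Here fake_command is a hypothetical stand-in for the elided command line, and run_captured and run_parallel are illustrative names, not part of the original answer.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

def run_captured(argv):
    """Run one command to completion and return its CompletedProcess."""
    # capture_output=True makes run() read stdout and stderr while the
    # command is running, so neither pipe can fill up.
    return subprocess.run(argv, capture_output=True, text=True)

def run_parallel(commands):
    """Run all commands concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=len(commands)) as pool:
        return list(pool.map(run_captured, commands))

def fake_command(param):
    # Hypothetical stand-in for the elided "... %d" command line:
    # writes param MB to stdout and 300 kB to stderr.
    code = ("import sys; n = int(sys.argv[1]); "
            "sys.stdout.write('o' * (n * 1000000)); "
            "sys.stderr.write('e' * 300000)")
    return [sys.executable, "-c", code, str(param)]

if __name__ == "__main__":
    r1, r8 = run_parallel([fake_command(1), fake_command(8)])
    for r in (r1, r8):
        if r.returncode != 0:
            print(r.stderr, file=sys.stderr)
            raise RuntimeError("failure")
    print(len(r1.stdout), len(r8.stdout))
```

The threads spend their time waiting on the child processes, so the two commands still run in parallel even though each worker thread is blocked in run().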

huangapple
  • Posted on July 27, 2023, 16:01:40
  • Please keep the link to this article when reposting: https://go.coder-hub.com/76777656.html