July 27, 2023 16:01:40

Python3: a subprocess sleeps when started by Popen

Question

I need to run two commands in parallel. Each takes a long time to run (about 1 minute for one and almost 2 minutes for the other), and both write a lot of data to stdout and stderr (about 300 kB on stderr and several MB on stdout). I have to capture both streams.

I used to execute them with subprocess.run(), but that runs them one after the other. Since the commands are single-threaded, I thought of parallelizing them with Popen.

Unfortunately, the simple way doesn't work:

import shlex
import subprocess
import sys

class test:
    def __init__(self, param):
        cmd = "... %d" % param  # the command line parametrized with param
        self.__p = subprocess.Popen(shlex.split(cmd), shell=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)

    def waitTerm(self, t=None):
        self.__p.wait(t)
        if self.__p.returncode != 0:
            print(self.__p.stderr.read(), file=sys.stderr)
            raise Exception('failure')
        self.o = self.__p.stdout.read()

t1 = test(1)
t8 = test(8)
t1.waitTerm()  # t1 should take longer
t8.waitTerm()
# here I can use stdout from both processes
print(t1.o)  # this is an example

The processes hang in the sleeping state. I believe this happens because the buffers of the pipes fill up.

What is the smartest thing to do in this case?

Answer 1

Score: 1

Threading and reading output as it is produced

You can use threads to start the two subprocesses and drain their output while they run.

This keeps the pipes from filling up and blocking the children. subprocess.Popen.communicate() captures stdout and stderr: it reads both streams until end-of-file and then waits for the process to exit.
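
To see the blocking that this avoids, here is a minimal sketch that reproduces it. The child command is a hypothetical stand-in (a Python one-liner that writes about 2 MB to stdout), not the real command from the question, and the function name demo_pipe_blocking is made up for illustration. With nobody reading the pipe, wait() is expected to time out; communicate() then drains the pipe and the child can finish.

```python
import subprocess
import sys

# Hypothetical stand-in for the real command: writes about 2 MB to stdout
CHILD_CODE = "import sys; sys.stdout.write('x' * 2_000_000)"


def demo_pipe_blocking(wait_timeout=2.0):
    """Return (wait_timed_out, captured_length) for a child that floods stdout."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    try:
        # Nobody reads the pipe here, so the child blocks in write()
        # once the OS pipe buffer is full and never reaches exit
        proc.wait(timeout=wait_timeout)
        timed_out = False
    except subprocess.TimeoutExpired:
        timed_out = True
    # communicate() drains stdout and stderr, which lets the child finish
    out, _err = proc.communicate()
    return timed_out, len(out)


if __name__ == "__main__":
    print(demo_pipe_blocking())
```

This is the same situation as calling wait() before read() in the question: the parent waits for the child to exit while the child waits for the parent to read.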

Here is how to do that with one reader thread per process:

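import shlex
import subprocess
import threading

class Test:
    def __init__(self, param):
        cmd = "... %d" % param  # the command line, parametrized with param
        self.process = subprocess.Popen(shlex.split(cmd), shell=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
        self.thread = None

    def reader_thread(self):
        # communicate() drains both pipes until EOF, then waits for the process to exit
        self.stdout, self.stderr = self.process.communicate()

    def run(self):
        # Keep a reference to the thread so that join() can wait for it
        self.thread = threading.Thread(target=self.reader_thread)
        self.thread.start()

    def join(self):
        self.thread.join()

t1 = Test(1)
t8 = Test(8)
t1.run()
t8.run()
# Wait for both reader threads to finish before using the captured output
t1.join()
t8.join()
# t1.stdout, t1.stderr, t8.stdout and t8.stderr are now available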
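
A more compact variant of the same idea, offered as a sketch rather than as part of the original answer: keep subprocess.run() from the question and call it from a thread pool. subprocess.run() with capture_output=True (Python 3.7+) uses communicate() internally, so each worker thread drains its own pair of pipes. The helper names run_captured, run_in_parallel and fake_command are hypothetical, and fake_command only stands in for the elided "... %d" command line.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_captured(argv):
    """Run one command to completion and return its CompletedProcess."""
    # capture_output=True makes run() drain stdout and stderr for us
    return subprocess.run(argv, capture_output=True, text=True)


def run_in_parallel(commands):
    """Run every argv list in `commands` concurrently; results keep input order."""
    with ThreadPoolExecutor(max_workers=len(commands)) as pool:
        return list(pool.map(run_captured, commands))


def fake_command(param):
    # Hypothetical stand-in for the real command: output size scales with param
    code = (
        "import sys; "
        f"sys.stdout.write('o' * {param} * 100_000); "
        f"sys.stderr.write('e' * {param} * 10_000)"
    )
    return [sys.executable, "-c", code]


if __name__ == "__main__":
    r1, r8 = run_in_parallel([fake_command(1), fake_command(8)])
    for result in (r1, r8):
        if result.returncode != 0:
            print(result.stderr, file=sys.stderr)
            raise RuntimeError("failure")
    print(len(r1.stdout), len(r8.stdout))
```

Threads are enough here because the work happens in the child processes; the Python threads spend their time blocked on pipe reads, so the GIL is not a bottleneck.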
