How to start independent, concurrently running Python processes from Java

Question

I have a multi-threaded Java application that calls a Python program via Runtime.exec(). This works fine. I now want each Java thread to start its own Python process for concurrency.
While this works, all Python processes seem to be restricted to a single CPU, so each process only gets a share of that CPU. In top I can see my n Python processes:
With n=1, the process uses 100% CPU.
With n=2, both processes use approx 50% CPU.
With n=10, all processes use around 10% CPU.

In htop I can see that only two CPUs are used: One for Java stuff and the other for the Python stuff.

I thought that running multiple Python processes would allow them to run completely independently from each other.

Ideas and hints? Thank you!

EDIT: Here is the code that leads to the creation of the Python processes. It's not a minimal example. I would create one if this isn't clear enough.

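        // Assumption: the interpreter is on the PATH as "python"; the posted snippet omitted the executable.
        String[] arguments = {"python", "-u", "-c", script};
        ProcessBuilder builder = new ProcessBuilder(arguments);
        process = builder.start();
        // Consume stderr on a separate thread so the child never blocks on a full pipe.
        errorStreamConsumer = new ErrorStreamConsumer(process.getErrorStream(),
                options.getTerminationSignalFromErrorStream(), Thread.currentThread());
        errorStreamConsumer.start();
        log.debug("Started process with arguments {}", Arrays.toString(arguments));
        // bis reads the Python process's stdout; bos writes to its stdin.
        BufferedInputStream bis = new BufferedInputStream(process.getInputStream());
        BufferedOutputStream bos = new BufferedOutputStream(process.getOutputStream());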

Here, script is the complete Python script as a String (NOT a file name but the actual Python code), and ErrorStreamConsumer is a thread that prints out the error channel. Communication with the process runs over the bis and bos input and output streams.

I do this for each Java thread, and it works fine, except that the Python processes seem to share a single CPU.
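For readers who want to reproduce the setup, here is a minimal, self-contained sketch of the pattern described above: one Python process per Java thread, each started through ProcessBuilder. It is not the asker's code. The class name PythonWorkers, the helper names, and the interpreter name "python3" are assumptions made for illustration, and it needs Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PythonWorkers {

    // Builds the argument vector. The interpreter has to be the first element.
    static List<String> buildCommand(String interpreter, String script) {
        return List.of(interpreter, "-u", "-c", script);
    }

    // Starts one child process, reads all of its output, and waits for it to exit.
    static String runOnce(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // merge stderr into stdout so one reader is enough
        Process process = builder.start();
        process.getOutputStream().close(); // this sketch sends nothing to the child's stdin
        try (InputStream in = process.getInputStream()) {
            String output = new String(in.readAllBytes(), StandardCharsets.UTF_8);
            process.waitFor();
            return output;
        }
    }

    // Runs `workers` independent child processes concurrently, one per pool thread.
    static List<String> runConcurrently(List<String> command, int workers) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < workers; i++) {
                Callable<String> task = () -> runOnce(command);
                futures.add(pool.submit(task));
            }
            List<String> outputs = new ArrayList<>();
            for (Future<String> future : futures) {
                outputs.add(future.get());
            }
            return outputs;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Assumption: "python3" is on the PATH. Each child prints its own process id.
        List<String> command = buildCommand("python3", "import os; print(os.getpid())");
        for (String output : runConcurrently(command, 4)) {
            System.out.print(output);
        }
    }
}
```

Each child is a separate operating-system process, so the scheduler is free to place them on different cores, provided the job is allowed to use more than one.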

Answer 1

Score: 1

First of all: Thanks to everyone who thought about my issue.

While I am a bit ashamed of the true nature of my problem I feel I should share what the issue was.

I ran my program with SLURM, a job scheduler that can also restrict CPU use. In my case it restricted my program to a single CPU, so additional CPUs could never be used.

I now realize that I have no issues with Python concurrency at all.
To be more concrete:
The -c parameter used to pass the script to the Python interpreter is not an issue here.
Also, kiran's idea does not apply here. I have one Java process and multiple Python processes started from it, and they now all consume as much CPU capacity as I give them (I just saw 3 x 300% CPU usage from the Python processes).

So the real problem was my assumption that this was a Python issue. I apologize for that and will try to describe my problem more clearly in the future.
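A restriction like the one described in this answer can be spotted from inside the job before blaming Python. The following is a rough diagnostic sketch, not something from the original post. The class name CpuBudget and the helper allowedCpuList are made up for illustration. SLURM_CPUS_PER_TASK, SLURM_CPUS_ON_NODE and SLURM_JOB_CPUS_PER_NODE are environment variables that SLURM sets for jobs, but some of them may be unset depending on how the job was submitted. The /proc/self/status check is Linux-only.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Optional;

public class CpuBudget {

    // Extracts the "Cpus_allowed_list" value from the lines of /proc/<pid>/status.
    static Optional<String> allowedCpuList(List<String> statusLines) {
        String key = "Cpus_allowed_list:";
        for (String line : statusLines) {
            if (line.startsWith(key)) {
                return Optional.of(line.substring(key.length()).trim());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) throws IOException {
        // On Linux this typically reflects the CPU affinity mask, not the physical core count.
        System.out.println("JVM sees " + Runtime.getRuntime().availableProcessors() + " processor(s)");

        // Prints "null" for variables that are not set (for example outside a SLURM job).
        for (String name : List.of("SLURM_CPUS_PER_TASK", "SLURM_CPUS_ON_NODE", "SLURM_JOB_CPUS_PER_NODE")) {
            System.out.println(name + " = " + System.getenv(name));
        }

        // Linux-only: the kernel's view of which CPUs this process may run on.
        Path status = Paths.get("/proc/self/status");
        if (Files.isReadable(status)) {
            String cpus = allowedCpuList(Files.readAllLines(status)).orElse("unknown");
            System.out.println("Allowed CPUs: " + cpus);
        }
    }
}
```

If the JVM reports a single processor on a multi-core machine, the limit most likely comes from the scheduler or from the CPU affinity settings, and child processes started from Java inherit it.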

Answer 2

Score: 0

Threads share the CPU of the parent process. Having 5 threads doesn't mean we can make use of all 5 cores; each thread shares the CPU/core of the main parent process.
In your case, 10 threads were sharing 100% of a CPU, so each got 10%.
Each thread is now running Python code with 10% CPU, so that is the computing power you get for Python.
I suggest using multiprocessing instead of multithreading: for example, each Java process starts one Python process, and you deploy multiple instances of Java.

Posted by huangapple on April 6, 2020 at 16:34:39
When reposting, please keep the link to this article: https://go.coder-hub.com/61055839.html