Submit multiple python jobs in parallel with different inputs from another python script
Question
I have a Python script where I need to provide flagged arguments, e.g.:
python test.py -i input_file -p parameter_1 -o output_file
I am making a pipeline in Python (pipeline.py) where I want to run test.py in parallel with different inputs.
If I run a for loop as below, it runs sequentially, but I want to submit parallel jobs for at least 5 inputs at a time, since I have more cores available.
for inputs in input_file_list:
    subprocess.run(['python', 'test.py', "-i "+inputs, "-p 10", "-o "+inputs+"_output.csv"])
How can I do this?
Thanks in advance!
Answer 1
Score: 1
Just use asyncio:
import asyncio
import os

def run_command(command: str = "") -> None:
    os.system(command)

async def run_command_async(command: str = "") -> None:
    await asyncio.to_thread(run_command, command)

async def main() -> None:
    input_file_list = []
    await asyncio.gather(
        *[run_command_async(f"python test.py -i {inputs} -p 10 -o {inputs}_output.csv") for inputs in input_file_list]
    )

if __name__ == "__main__":
    asyncio.run(main())
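
A note on the approach above: asyncio.gather starts every command at once, while the question asks for roughly 5 jobs at a time. Below is a minimal sketch of one way to cap concurrency, assuming Python 3.9+ (for asyncio.to_thread) and using subprocess.run with an argument list instead of os.system so file names do not need shell quoting; the MAX_JOBS value and the sample input names are placeholders, not taken from the question.

import asyncio
import subprocess

MAX_JOBS = 5  # assumed cap, based on the ~5 parallel jobs asked for; adjust to your core count

async def run_one(inputs: str, sem: asyncio.Semaphore) -> None:
    async with sem:  # waits here while MAX_JOBS commands are already running
        # subprocess.run blocks, so hand it off to a worker thread
        await asyncio.to_thread(
            subprocess.run,
            ["python", "test.py", "-i", inputs, "-p", "10", "-o", f"{inputs}_output.csv"],
            check=True,
        )

async def main() -> None:
    input_file_list = ["sample_1.csv", "sample_2.csv"]  # placeholder inputs
    sem = asyncio.Semaphore(MAX_JOBS)
    await asyncio.gather(*(run_one(inputs, sem) for inputs in input_file_list))

if __name__ == "__main__":
    asyncio.run(main())

Alternatively, concurrent.futures.ThreadPoolExecutor(max_workers=5) with a helper that calls subprocess.run gives the same cap without asyncio.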