How can I make a Python subprocess wait for input when running through a SLURM script?
Question
I am running Python code through a SLURM script on a remote server that I access over SSH. Occasionally, license problems on the SLURM platform raise errors in Python and end the subprocess. I want to use try-except so that the Python subprocess waits until the issue is fixed and then continues from where it stopped.
What are some smart ways to implement this?
My most obvious solution is to keep Python in a loop when the error occurs and have it read a file every X seconds. Once I have fixed the error and want the job to continue, I would write something to the file to break the loop. Is there a smarter way to provide input to the Python subprocess while it is running through the SLURM script?
Answer 1
Score: 1
One idea is to add a signal handler for SIGUSR1 to your Python script.
In the signal handler function, you can set a global variable, send a message, or set a threading.Event
that the main process is waiting on.
Then you can signal the process with:
kill -USR1 <PID>
or with the Python os.kill() equivalent.
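A minimal sketch of the signal approach, assuming the license failure surfaces as an exception you can catch. The names run_step, wait_for_resume and resume_requested are hypothetical, and RuntimeError is only a stand-in for whatever error your code actually raises. The handler just flips a global flag, which the main loop checks between short sleeps:

```python
import os
import signal
import time

resume_requested = False  # flipped to True by the SIGUSR1 handler


def _on_usr1(signum, frame):
    """Signal handler: record that the operator asked the job to resume."""
    global resume_requested
    resume_requested = True


# Must be called from the main thread of the Python process.
signal.signal(signal.SIGUSR1, _on_usr1)


def wait_for_resume(poll_seconds=1.0):
    """Block until SIGUSR1 has been received, then reset the flag."""
    global resume_requested
    pid = os.getpid()
    print(f"PID {pid} is blocked; resume with: kill -USR1 {pid}", flush=True)
    while not resume_requested:
        time.sleep(poll_seconds)
    resume_requested = False


def run_step(step, retry_on=(RuntimeError,), poll_seconds=1.0):
    """Call step(); if it raises one of retry_on, wait for SIGUSR1 and retry."""
    while True:
        try:
            return step()
        except retry_on as exc:  # assumption: replace with the real license error
            print(f"step failed: {exc}", flush=True)
            wait_for_resume(poll_seconds)
```

Wrapping each unit of work as run_step(lambda: do_work(args)), where do_work is your own function, means a failed step is retried after the signal arrives rather than the whole job restarting.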
That said, there is something to be said for the simplicity of your process doing:
touch /tmp/blocked.$$
and your program waiting in a loop with a 1s sleep for that file to be removed. Because the file name contains the process ID, you can tell which process is blocked.
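The marker-file variant could look roughly like this in Python. The function name wait_until_unblocked and its parameters are hypothetical, and the system temporary directory stands in for /tmp:

```python
import os
import tempfile
import time
from pathlib import Path


def wait_until_unblocked(marker_dir=None, poll_seconds=1.0):
    """Create blocked.<PID> in marker_dir and sleep until someone deletes it."""
    if marker_dir is None:
        marker_dir = tempfile.gettempdir()  # usually /tmp on Linux
    marker = Path(marker_dir) / f"blocked.{os.getpid()}"
    marker.touch()
    print(f"blocked; remove {marker} to continue", flush=True)
    while marker.exists():
        time.sleep(poll_seconds)
    return marker
```

Calling wait_until_unblocked() inside the except branch pauses the job, and removing the file (for example with rm) lets it continue. On a cluster, the marker directory needs to be one you can reach from wherever you log in, since /tmp is often local to the compute node.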