Service relaunch Python script when it has stalled in Linux

Question

I am trying to run a Python script as a service on Linux. I found some good instructions here, and how to restart a failed script on the next run here.

However, I have another scenario: the script does not abort with a failure, it just stalls (it is downloading a resource and gets stuck at 99%). When I run it manually, I can watch it sit stuck for 1-2 minutes; I then force-abort the script (CTRL-C) and rerun it, and it works fine.

How can I make the service do the same? I can redirect all of the script's output to a file (right now it goes to STDOUT, where I can observe the stall). Is there a way for the service to notice that the output file has not been updated in the last 5 minutes and then force a restart of the script, even though the script is still running (but stalled)?

Answer 1

Score: 1

Instead of monitoring the output, you could add a timeout on the function that may cause the script to stall. This is explained here.

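Basically, you arm an alarm signal (SIGALRM) before calling the function and register a handler that raises an exception. If the function returns in time, you cancel the alarm; if it is stuck, the alarm fires, the handler raises, and the exception interrupts the stalled call. This relies on signal.alarm, which is available on Unix only, and the handler has to be registered from the main thread.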

An example from the thread I linked:

    import signal
    import time

    # Handler for the timeout: raising here interrupts the stalled call
    def handler(signum, frame):
        print("Forever is over!")
        raise Exception("end of time")

    # This function *may* run for an indeterminate time...
    def loop_forever():
        while True:
            print("sec")
            time.sleep(1)

    # Register the handler for SIGALRM
    signal.signal(signal.SIGALRM, handler)

    # Define a 10-second timeout for your function
    signal.alarm(10)

    try:
        loop_forever()
    except Exception as exc:
        print(exc)
    finally:
        # Cancel the timer if the function returned before the timeout
        # (ok, mine won't but yours maybe will :)
        signal.alarm(0)

This prints "sec" once per second until the alarm fires, and then:

    Forever is over!
    end of time
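The alarm pattern above can be packaged so that the timer is always cancelled, even when the guarded call raises. A minimal sketch, assuming Python 3 on Unix and code running in the main thread; `time_limit` and `StallTimeout` are hypothetical names, not part of the linked thread or the standard library:

```python
import signal
from contextlib import contextmanager


class StallTimeout(Exception):
    """Raised when the guarded block runs longer than allowed."""


@contextmanager
def time_limit(seconds):
    # Hypothetical helper: arm SIGALRM for the duration of the block.
    def _on_alarm(signum, frame):
        raise StallTimeout(f"no result after {seconds}s")

    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(seconds)  # whole seconds only
    try:
        yield
    finally:
        signal.alarm(0)                          # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)  # restore the old handler
```

Wrapping the download call in `with time_limit(300):` and catching `StallTimeout` in a retry loop would mimic the manual CTRL-C and rerun described in the question.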

The thread also describes how to do this with multiprocessing.Process, which looks like this:

    import multiprocessing
    import time

    # Stand-in for the function that may stall
    def bar():
        for i in range(100):
            print("Tick")
            time.sleep(1)

    if __name__ == '__main__':
        # Start bar as a process
        p = multiprocessing.Process(target=bar)
        p.start()

        # Wait for 10 seconds or until the process finishes
        p.join(10)

        # If the process is still active
        if p.is_alive():
            print("running... let's kill it...")

            # Terminate - may not work if process is stuck for good
            p.terminate()
            # OR Kill - will work for sure, no chance for process to finish nicely however
            # p.kill()  # available in Python 3.7+
            p.join()
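
If the stalled step cannot be isolated into one function, the same kill-after-timeout idea can be applied to the whole script from a small wrapper that the service starts instead. The sketch below swaps multiprocessing.Process for subprocess.run, whose timeout argument kills the child when it expires; `run_until_done`, the command passed to it, and the retry count are assumptions for illustration:

```python
import subprocess
import sys


def run_until_done(cmd, timeout, max_attempts=3):
    """Run cmd, killing and rerunning it whenever it exceeds timeout seconds.

    Returns the exit code of the first attempt that finished in time,
    or None if every attempt stalled.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            # subprocess.run kills the child itself when the timeout expires
            return subprocess.run(cmd, timeout=timeout).returncode
        except subprocess.TimeoutExpired:
            # Report on stderr so the message ends up in the service log
            print(f"attempt {attempt} stalled after {timeout}s, restarting",
                  file=sys.stderr)
    return None
```

A wrapper could call, for example, `run_until_done([sys.executable, "download.py"], timeout=300)`, where `download.py` is a placeholder for the real script. Note that this limits total run time rather than detecting a silent output file, so the timeout has to exceed the longest normal run.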

huangapple
  • Posted on July 17, 2023, 19:49:15
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76704151.html