Service relaunch Python script when it has stalled in Linux

Question

I am trying to run a Python script as a service in Linux. I found some good instructions here, and how to restart a failed script in the next run here.

However, I have another scenario, where the script does not abort with a failure but simply stalls (it is downloading some resource and stays stuck at 99%). When I run it manually, I can watch it sit stuck for 1-2 minutes; I then force-abort the script (CTRL-C), rerun it, and it works fine.

How can I make the service do that as well? I can redirect all the output of the script to a file (right now the output goes to STDOUT, where I can observe the stalling). Is there a way for the service to notice that the output file hasn't been updated in the last 5 minutes and then force a restart of the script, even though the script is still running (but stalled)?

Answer 1

Score: 1

Instead of monitoring the output, you could add a timeout on the function that may cause the script to stall. This is explained here.

Basically, that means scheduling an alarm signal whose handler raises an exception. If the function completes in time, you cancel the alarm and nothing happens; if it is stuck, the alarm fires and the exception interrupts it.

An example from the thread I linked:

In [1]: import signal

# Define a handler for the timeout
In [2]: def handler(signum, frame):
   ...:     print("Forever is over!")
   ...:     raise Exception("end of time")
   ...:

# This function *may* run for an indeterminate time...
In [3]: def loop_forever():
   ...:     import time
   ...:     while 1:
   ...:         print("sec")
   ...:         time.sleep(1)
   ...:

# Register the signal function handler
In [4]: signal.signal(signal.SIGALRM, handler)
Out[4]: 0

# Define a timeout for your function
In [5]: signal.alarm(10)
Out[5]: 0

In [6]: try:
   ...:     loop_forever()
   ...: except Exception as exc:
   ...:     print(exc)
   ...:
sec
sec
sec
sec
sec
sec
sec
sec
Forever is over!
end of time

# Cancel the timer if the function returned before timeout
# (ok, mine won't but yours maybe will :)
In [7]: signal.alarm(0)
Out[7]: 0
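
For reuse in a script, the same alarm idea can be wrapped in a context manager. The following is a minimal Python 3 sketch, not code from the linked thread: the names StallTimeout, time_limit, slow_download and run_with_limit are made up for illustration, and it uses signal.setitimer instead of signal.alarm so that fractional seconds work. Like the session above, it only works on Unix and only in the main thread.

```python
import signal
import time
from contextlib import contextmanager


class StallTimeout(Exception):
    """Raised when the guarded block runs too long (hypothetical name)."""


@contextmanager
def time_limit(seconds):
    """Raise StallTimeout if the with-block takes longer than `seconds`."""
    def _on_alarm(signum, frame):
        raise StallTimeout(f"no result after {seconds} s")

    # Install our handler and remember the one that was there before
    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)    # schedule the alarm
    try:
        yield
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)      # cancel a pending alarm
        signal.signal(signal.SIGALRM, previous)      # restore the old handler


def slow_download(duration):
    """Stand-in for the call that sometimes stalls (hypothetical)."""
    time.sleep(duration)
    return "done"


def run_with_limit(limit, duration):
    """Run the stand-in task, giving up after `limit` seconds."""
    try:
        with time_limit(limit):
            return slow_download(duration)
    except StallTimeout:
        return "timed out"


if __name__ == "__main__":
    print(run_with_limit(1, 5))
```

In the real script, the body of the with-block would be the download call, and the except branch is where a retry could go.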

The thread also explains how to do this with multiprocessing.Process, which looks like this:

import multiprocessing
import time

# Stand-in for the task that may stall
def bar():
    for i in range(100):
        print("Tick")
        time.sleep(1)

if __name__ == '__main__':
    # Start bar as a process
    p = multiprocessing.Process(target=bar)
    p.start()

    # Wait for 10 seconds or until process finishes
    p.join(10)

    # If the process is still alive
    if p.is_alive():
        print("running... let's kill it...")

        # Terminate (SIGTERM) - may not work if the process is stuck for good
        p.terminate()
        # OR kill (SIGKILL, Python 3.7+) - will work for sure, but the process gets no chance to finish nicely
        # p.kill()

        p.join()
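
Since the goal is to restart the whole script when it stalls, the same wait-then-kill pattern can also be applied from the outside with subprocess.run and its timeout argument. This is a rough sketch under assumptions, not something from the linked thread: run_until_success, its parameters and the demo command are hypothetical, and the timeout limits the total run time rather than watching the output for inactivity.

```python
import subprocess
import sys


def run_until_success(cmd, timeout, max_attempts=3):
    """Run `cmd`, restarting it whenever it exceeds `timeout` seconds.

    Returns the 1-based number of the attempt that exited with status 0,
    or None if every attempt stalled or failed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            # subprocess.run kills the child itself when the timeout expires
            result = subprocess.run(cmd, timeout=timeout)
        except subprocess.TimeoutExpired:
            print(f"attempt {attempt} stalled, restarting")
            continue
        if result.returncode == 0:
            return attempt
        print(f"attempt {attempt} exited with status {result.returncode}")
    return None


if __name__ == "__main__":
    # Hypothetical stand-in for the real script: a child that finishes quickly
    demo_cmd = [sys.executable, "-c", "print('downloaded')"]
    print(run_until_success(demo_cmd, timeout=10))
```

In a service, cmd would point at the real script instead of the demo command, with a timeout comfortably above the duration of a normal run.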

huangapple
  • Published on July 17, 2023, 19:49:15
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76704151.html