monit: golang site restarts slowly and status shows "Does not exist"
Question
I created a monit configuration that must restart my golang site when it crashes:
$ cd /etc/monit/conf.d
$ vim checkSite
It starts the program with nohup and saves its pid to a file:
check process site with pidfile /root/go/path/to/goSite/run.pid
start program = "/bin/bash -c 'cd /root/go/path/to/goSitePath; nohup ./goSite > /dev/null 2>&1 & echo $! > run.pid'" with timeout 5 seconds
stop program = "/bin/kill -9 `cat /root/go/path/to/goSitePath/run.pid`"
It starts fine:
Process 'site'
status Running
monitoring status Monitored
pid 29723
parent pid 1
uptime 2m
children 0
memory kilobytes 8592
memory kilobytes total 8592
memory percent 0.4%
memory percent total 0.4%
cpu percent 0.0%
cpu percent total 0.0%
data collected Thu, 05 Mar 2015 07:20:32
Then, to test how it restarts after a crash, I killed the golang site manually.
Here I have two issues:
- The site restarts rather slowly: it takes 1 minute, although in the configuration I set with timeout 5 seconds.
- The status of site in monit becomes Does not exist even after the site has in fact restarted. I guess this happens because the site's pid changes after it is killed and restarted, but I don't know how to work around this.
Status after restart:
Process 'site'
status Does not exist
monitoring status Monitored
data collected Thu, 05 Mar 2015 08:04:44
How can I reduce the restart time, and how do I fix the site's status in monit?
monit log:
[Mar 5 08:04:44] error : 'site' process is not running
[Mar 5 08:04:44] info : 'site' trying to restart
[Mar 5 08:04:44] info : 'site' start: /bin/bash
[Mar 5 08:06:44] info : 'site' process is running with pid 31479
Update
My golang site is rather simple:
package main

import (
	"fmt"

	"github.com/go-martini/martini"
)

func main() {
	m := martini.Classic()
	m.Get("/", func() {
		fmt.Println("main page")
	})
	m.Run()
}
Update 2
I tried to make monit restart my golang site faster by removing the pid file as well. I ran kill 29723 && rm run.pid and started a timer to measure how long it took for the site to become accessible again. It took 85 seconds, so removing the pid file did not help monit restart the site any faster.
Answer 1
Score: 5
monit doesn't have any subscription mechanism to discover immediately that a process has died.
In daemon mode, as documented, monit works by periodically polling the status of all the configured rules. Its poll cycle is configured when the daemon starts and defaults to 2 minutes in some Linux distributions, which means that in this case monit may need up to 2 minutes to take any action.
Check this setting in your monitrc. It is configured with the set daemon directive; for example, if you want to check the status every 5 seconds, set:
set daemon 5
On every cycle it updates its status and, depending on that status, executes actions if needed. So if it detects that the process doesn't exist, it will report Does not exist until the next poll cycle, even if it has already decided to restart it.
The timeout in the start program directive has nothing to do with this poll cycle; it is the time monit gives the service to start. If the service doesn't start within this time, monit will report it.
If monit doesn't meet your requirements, you can also try supervisord, which is always aware of the state of the programs it runs.
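A supervisor can know the state immediately because it runs the program as its own child process and waits on it, instead of polling a pid file. The Go sketch below illustrates that approach under stated assumptions: it is not supervisord's code (supervisord is a separate tool with its own configuration), the supervise function is hypothetical, and ./goSite is the binary name taken from the question.

```go
package main

import (
	"fmt"
	"os/exec"
)

// supervise calls run, which must block until the child exits, and starts the
// child again as soon as run returns, allowing up to maxRestarts restarts.
// It returns the total number of times the child was started.
// Hypothetical helper for illustration only.
func supervise(run func() error, maxRestarts int) int {
	starts := 0
	for {
		starts++
		err := run() // returns as soon as the child exits; no poll interval involved
		fmt.Printf("child exited (start #%d): %v\n", starts, err)
		if starts > maxRestarts {
			return starts
		}
	}
}

func main() {
	// Run the site as a direct child in the foreground (no nohup, no "&"),
	// so that Run returns exactly when the process terminates.
	run := func() error { return exec.Command("./goSite").Run() }
	supervise(run, 3)
}
```

With this design the program must stay in the foreground; if it backgrounds itself, the parent sees an immediate exit and loses track of the real process.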