How does Docker kill the healthcheck process after a timeout?
Question
I'm trying to reverse engineer, or find in the Go source (I searched through the code to no avail), how Docker kills a healthcheck command when it times out.
This issue on GitHub seems to imply that a SIGTERM is sent, followed by a SIGKILL, but I am unable to trap either signal and log it to see the delay between the two. Running strace on dockerd hasn't shown me how it is done either, nor has tracing docker run.
With the following code, only the numbers 0-4 are logged during the docker run:
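$ cat healthcheck.c
#include <signal.h>
#include <unistd.h>
#include <stdio.h>

FILE *fp = NULL;

/* Log any signal we manage to catch. fprintf() is not async-signal-safe,
   but it is good enough for this experiment. */
void catch(int signo) {
    fprintf(fp, "caught: %d\n", signo);
    fflush(fp);
}

int main(void) {
    /* Open the log before installing handlers so fp is never NULL in catch(). */
    fp = fopen("/tmp/healthcheck.log", "a");
    if (fp == NULL) {
        perror("fopen");
        return 1;
    }
    signal(SIGTERM, catch);
    signal(SIGINT, catch);
    signal(SIGHUP, catch);
    /* Deliberately outlive the 5s healthcheck timeout, logging once per second. */
    for (int i = 0; i < 3600; i++) {
        sleep(1);
        fprintf(fp, "%d\n", i);
        fflush(fp);
    }
    return 0;
}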
Dockerfile:
FROM ubuntu:22.04 as base
COPY healthcheck /
HEALTHCHECK --interval=1s --timeout=5s CMD ["/healthcheck"]
CMD sleep 60
Answer 1
Score: 1
Per this issue on the project, SIGKILL is the standard way the healthcheck is terminated now. To quote,
> [...] health checks are not expected to have side-effects, and not expected to require a "graceful" shutdown when they time out.