Why is the startup probe ignored?
Question
I deployed WAS to Kubernetes (version 1.16) and used all three types of probes.
The liveness probe checks that the WAS process is running and that all of its open ports are listening. The readiness probe calls the WAS health check API via HTTP GET. The startup probe uses the same logic as the liveness probe, but has one additional task: it initializes the health check API. This means that if the startup probe is never executed, the health check API is not enabled and the readiness probe always fails.
My understanding is that:
- if the startup probe fails repeatedly beyond the threshold, the container is restarted.
- if the startup probe succeeds, the readiness probe should not fail. (Note that the startup probe also fails if enabling the health check API fails, so there is no case where the readiness probe fails even though the startup probe succeeded.)
In conclusion, the readiness probe should never fail, because the only possible outcomes are that the container is restarted or that the startup probe succeeds. In addition, the startup probe has a failure threshold of 36 and a period of 5 seconds, so the liveness and readiness probes should be held back for up to 36 × 5 = 180 seconds while the startup probe is still failing. However, there are cases where the readiness probe fails before 3 minutes have passed.
This leads me to believe that the startup probe is being ignored and the liveness/readiness probes are executed anyway.
According to the Kubernetes docs, the startup probe exists to ensure that the liveness and readiness probes start running at the right time. The problem is that if this probe is ignored, I have to fall back on a fixed delay via initialDelaySeconds, which is not as reliable as waiting for the startup probe to succeed.
First of all, I would like to know whether the problem I suspect is actually happening, and I don't know how to verify it. In the k8s events I could only see the readiness probe failure events, not the success or failure of the startup probe. Maybe I have misunderstood how the startup probe works. I hope someone can suggest a proper solution.
Below is the probe configuration I wrote.
livenessProbe:
  exec:
    command:
    - liveness
  initialDelaySeconds: 10
readinessProbe:
  exec:
    command:
    - readiness
  initialDelaySeconds: 10
startupProbe:
  exec:
    command:
    - liveness
    - -startup
  failureThreshold: 36
  periodSeconds: 5
So far I have analyzed the logs with commands such as kubectl describe and kubectl logs, and checked the k8s events.
Answer 1
Score: 2
The startup probe is not enabled by default in Kubernetes 1.16 (see https://v1-22.docs.kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/). To use this probe, you have to enable the StartupProbe feature gate.
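As a rough illustration of what enabling the gate can look like, here is a minimal sketch of a kubelet configuration file. It assumes the kubelet is started from a KubeletConfiguration file; where that file lives and how it is loaded depends on how the cluster was installed, so treat the layout as an assumption rather than a recipe for your setup.

```yaml
# Sketch only: turns on the alpha StartupProbe gate for the kubelet
# (the gate is off by default in Kubernetes 1.16 and 1.17).
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  StartupProbe: true
# The API server needs the same gate, typically via its command-line flag:
#   --feature-gates=StartupProbe=true
```

If the gate is off on the API server, the startupProbe field is most likely dropped when the Pod is created, which would be consistent with seeing no startup probe events at all. One way to check is to look at the stored Pod with kubectl get pod <pod-name> -o yaml and see whether the startupProbe section is still present. From Kubernetes 1.18 onward the gate is enabled by default, so upgrading is an alternative to setting it manually.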