Under what circumstances will a completed pod not be recycled?

Question

Recently, a lot of completed pods have been piling up on our k8s cluster, which I suspect is related to insufficient cluster resources.

The Nextflow task I created contains several processes. Usually they are executed sequentially, and a new pod is created after the previous pod completes. Recently, however, a large number of tasks have been submitted to the cluster. While observing it, I saw many completed pods appear, and the tasks got stuck. I'm wondering whether this has something to do with cluster resources or with Nextflow, and what else could happen if these pods keep failing to be recycled.

Answer 1

Score: 1

More details are needed, but if your pods are reaching the Completed status, that means the code is being executed properly.

In any case, if your pod throws an error or fails, it should crash and not change its status to Completed.

If there is a resource issue in K8s, the pod won't be scheduled at all and will be stuck in the Pending state.
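To illustrate the Pending case, here is a minimal Python sketch that inspects a pod description and reports why the scheduler could not place it. It assumes the pod is given as a dict parsed from the output of `kubectl get pod <name> -o json`; the helper name `unschedulable_message` and the sample data (including the message text) are made up for illustration.

```python
from typing import Optional


def unschedulable_message(pod: dict) -> Optional[str]:
    """Return the scheduler's message if the pod is Pending and unschedulable, else None."""
    status = pod.get("status", {})
    if status.get("phase") != "Pending":
        return None
    # A pod the scheduler could not place carries a PodScheduled condition
    # with status "False" and reason "Unschedulable".
    for cond in status.get("conditions", []):
        if (
            cond.get("type") == "PodScheduled"
            and cond.get("status") == "False"
            and cond.get("reason") == "Unschedulable"
        ):
            return cond.get("message", "")
    return None


# Hand-written sample (an assumption, not real cluster output).
sample_pending_pod = {
    "metadata": {"name": "nf-task-example"},
    "status": {
        "phase": "Pending",
        "conditions": [
            {
                "type": "PodScheduled",
                "status": "False",
                "reason": "Unschedulable",
                "message": "0/3 nodes are available: 3 Insufficient cpu.",
            }
        ],
    },
}

print(unschedulable_message(sample_pending_pod))
```

If this returns a message about insufficient CPU or memory, the cluster is short on resources for that pod; if it returns None for a pod shown as Completed, scheduling was not the problem for that pod.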

Another case could be that the pod starts crashing or gets an OOM kill event, so in this case you have to keep checking the status of the pod.
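One way to keep checking pod status is to look at the reasons recorded on each container. The Python sketch below is only an illustration: it assumes the pod is a dict parsed from `kubectl get pod <name> -o json`, the helper name `container_reasons` is hypothetical, and the sample pods are hand-written rather than taken from a real cluster.

```python
from typing import List, Tuple


def container_reasons(pod: dict) -> List[Tuple[str, str]]:
    """Collect (container name, reason) pairs from current and previous container states."""
    reasons = []
    for cs in pod.get("status", {}).get("containerStatuses", []):
        # "state" is the current state, "lastState" the previous one;
        # an OOM kill often shows up in lastState after a restart.
        for key in ("state", "lastState"):
            for kind in ("waiting", "terminated"):
                detail = cs.get(key, {}).get(kind)
                if detail and detail.get("reason"):
                    reasons.append((cs["name"], detail["reason"]))
    return reasons


# Hand-written samples (assumptions for illustration only).
crashing_pod = {
    "status": {
        "containerStatuses": [
            {
                "name": "main",
                "state": {"waiting": {"reason": "CrashLoopBackOff"}},
                "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}},
            }
        ]
    }
}
completed_pod = {
    "status": {
        "phase": "Succeeded",
        "containerStatuses": [
            {
                "name": "main",
                "state": {"terminated": {"reason": "Completed", "exitCode": 0}},
                "lastState": {},
            }
        ],
    }
}

print(container_reasons(crashing_pod))   # [('main', 'CrashLoopBackOff'), ('main', 'OOMKilled')]
print(container_reasons(completed_pod))  # [('main', 'Completed')]
```

A reason of Completed with exit code 0 means the container finished normally, while OOMKilled or CrashLoopBackOff points to the crashing case described above.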

huangapple
  • Posted on July 6, 2023 at 10:41:26
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76625172.html