When using "kubectl drain node", Kubernetes doesn't wait for the new pods to become healthy before killing the old ones
Question
When I do an image update in my deployment, changing from version 1.0.0 to 2.0.0 for example, with these settings:
strategy:
  rollingUpdate:
    maxSurge: 25%
    maxUnavailable: 0
  type: RollingUpdate
selector:
  matchLabels:
    app: platform-menu-backend
The old version 1.0.0 stays ready the whole time while version 2.0.0 is NOT ready. Only when version 2.0.0 becomes ready does version 1.0.0 die, so I have no downtime in the application.
The problem is when I use the "kubectl drain node" command. It recreates the pods from the drained node on another healthy node, but it doesn't wait for the new pods to be ready; it kills the old pods as soon as the new ones are created. So I have downtime in the application.
How can I make Kubernetes wait for the new pod to be healthy and only then kill the old pod?
Answer 1
Score: 2
To avoid directly impacting your workloads when draining a node in a Kubernetes cluster, you can create a PodDisruptionBudget (PDB) for your deployment. By setting minAvailable or maxUnavailable in your PDB, evictions requested by the drain command are refused (and retried) whenever they would violate those constraints, so a pod is only removed once enough replicas are available elsewhere. For more information, check out the Kubernetes documentation on PDBs: https://kubernetes.io/docs/tasks/run-application/configure-pdb.
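As a minimal sketch, a PDB for the deployment in the question could look like the manifest below. The object name and the minAvailable value are assumptions for illustration; minAvailable must stay below your replica count, otherwise the drain will never be allowed to evict the pod.

# Hypothetical PDB matching the app: platform-menu-backend label from the question
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: platform-menu-backend-pdb   # illustrative name
spec:
  minAvailable: 1                   # assumed value; keep it below spec.replicas
  selector:
    matchLabels:
      app: platform-menu-backend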
Another option is to make the target node unschedulable (cordon it) before rolling out or restarting your deployment. The replacement pods will then be scheduled on other available nodes, allowing you to drain the original node safely, as sketched below. However, this is not the preferred solution for ensuring high availability of your applications.
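A sketch of that sequence, assuming a node called node-1 and a deployment called platform-menu-backend (both names are placeholders inferred from the labels in the question):

# Mark the node unschedulable so replacement pods land on other nodes
kubectl cordon node-1

# Roll the deployment; with maxUnavailable: 0 the old pods keep serving
# until their replacements on other nodes report Ready
kubectl rollout restart deployment/platform-menu-backend
kubectl rollout status deployment/platform-menu-backend

# Once the rollout has finished, the node can be drained without downtime
kubectl drain node-1 --ignore-daemonsets --delete-emptydir-data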
For better resilience, it is also recommended to increase the number of nodes in your cluster and the number of replicas for your workloads. This ensures that even if a node is drained, your application will still be up and running on another node.
Comments