When using "kubectl drain node", Kubernetes doesn't wait for the new pods to become healthy before killing the old ones
Question
When I do an image update in my deployment, changing from version 1.0.0 to 2.0.0 for example, with these settings:
strategy:
  rollingUpdate:
    maxSurge: 25%
    maxUnavailable: 0
  type: RollingUpdate
selector:
  matchLabels:
    app: platform-menu-backend
The old version 1.0.0 stays ready the whole time while version 2.0.0 is NOT ready. Only when version 2.0.0 becomes ready does version 1.0.0 die, so I have no downtime in the application.
The problem is when I use the "kubectl drain node" command. It recreates the pods from the drained node on another healthy node, but it doesn't wait for the new pods to be ready; it kills the old pods as soon as the new ones are created. So I have downtime in the application.
How can I make Kubernetes wait for the new pod to be healthy and only then kill the old pod?
Answer 1
Score: 2
To avoid directly impacting your workloads when draining a node in a Kubernetes cluster, you can create a PodDisruptionBudget (PDB) for your deployment. By setting minAvailable or maxUnavailable in your PDB, evictions requested by the drain command are refused (and retried) whenever they would violate those constraints, so a pod is only removed once enough replicas are available elsewhere. For more information, check out the Kubernetes documentation on PDBs: https://kubernetes.io/docs/tasks/run-application/configure-pdb.
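As a minimal sketch, a PDB for the deployment in the question could look like the manifest below. The object name and the minAvailable value are assumptions for illustration; minAvailable must stay below your replica count, otherwise the drain will never be allowed to evict the pod.

# Hypothetical PDB matching the app: platform-menu-backend label from the question
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: platform-menu-backend-pdb   # illustrative name
spec:
  minAvailable: 1                   # assumed value; keep it below spec.replicas
  selector:
    matchLabels:
      app: platform-menu-backend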
Another option is to make the target node unschedulable (cordon it) before rolling out or restarting your deployment. The replacement pods will then be scheduled on other available nodes, allowing you to drain the original node safely, as sketched below. However, this is not the preferred solution for ensuring high availability of your applications.
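A sketch of that sequence, assuming a node called node-1 and a deployment called platform-menu-backend (both names are placeholders inferred from the labels in the question):

# Mark the node unschedulable so replacement pods land on other nodes
kubectl cordon node-1

# Roll the deployment; with maxUnavailable: 0 the old pods keep serving
# until their replacements on other nodes report Ready
kubectl rollout restart deployment/platform-menu-backend
kubectl rollout status deployment/platform-menu-backend

# Once the rollout has finished, the node can be drained without downtime
kubectl drain node-1 --ignore-daemonsets --delete-emptydir-data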
For better resilience, it is also recommended to increase the number of nodes in your cluster and the number of replicas for your workloads. This ensures that even if a node is drained, your application will still be up and running on another node.
Comments