Pod creation in EKS cluster fails with FailedScheduling error

Question


I created a new EKS cluster with one worker node in a public subnet. I can query the node, connect to the cluster, and run the pod creation command. However, when I try to create a pod, it fails with the error below, which I got by describing the pod. Please advise.

    Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
    Events:
      Type     Reason            Age   From               Message
      ----     ------            ----  ----               -------
      Warning  FailedScheduling  81s   default-scheduler  0/1 nodes are available: 1 Too many pods. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod.
      Warning  FailedScheduling  16m                 default-scheduler  0/2 nodes are available: 2 Too many pods, 2 node(s) had untolerated taint {node.kubernetes.io/unschedulable: }, 2 node(s) were unschedulable. preemption: 0/2 nodes are available: 2 Preemption is not helpful for scheduling.
      Warning  FailedScheduling  16m                 default-scheduler  0/3 nodes are available: 2 node(s) had untolerated taint {node.kubernetes.io/unschedulable: }, 2 node(s) were unschedulable, 3 Too many pods. preemption: 0/3 nodes are available: 1 No preemption victims found for incoming pod, 2 Preemption is not helpful for scheduling.
      Warning  FailedScheduling  14m (x3 over 22m)   default-scheduler  0/2 nodes are available: 1 node(s) had untolerated taint {node.kubernetes.io/unschedulable: }, 1 node(s) were unschedulable, 2 Too many pods. preemption: 0/2 nodes are available: 1 No preemption victims found for incoming pod, 1 Preemption is not helpful for scheduling.
      Warning  FailedScheduling  12m                 default-scheduler  0/2 nodes are available: 1 Too many pods, 2 node(s) had untolerated taint {node.kubernetes.io/unschedulable: }, 2 node(s) were unschedulable. preemption: 0/2 nodes are available: 2 Preemption is not helpful for scheduling.
      Warning  FailedScheduling  7m14s               default-scheduler  no nodes available to schedule pods
      Warning  FailedScheduling  105s (x5 over 35m)  default-scheduler  0/1 nodes are available: 1 Too many pods. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod.

I can get the status of the node, and it looks ready:

    kubectl get nodes
    NAME                         STATUS   ROLES    AGE   VERSION
    ip-10-0-12-61.ec2.internal   Ready    <none>   15m   v1.24.7-eks-fb459a0

While troubleshooting, I tried the following:

  1. Recreated the complete demo cluster: still the same error.
  2. Recreated the pod with different images: still the same error.
  3. Changed the instance type to t3.micro: still the same error.
  4. Reviewed the security groups and other cluster parameters: could not identify the root cause.

Answer 1

Score: 7

This is caused by the node's pod limit, which on EKS follows from the per-node IP address limit.

According to the official Amazon documentation, a t3.micro supports at most 2 network interfaces with 2 private IPv4 addresses each. That gives you roughly 4 IPs to use, and the first IP is used by the node itself. The default system pods, some of which run as DaemonSets, also take up slots, which likely leaves no room for your pod.

Add another node, or move to a larger instance type that can handle more pods.
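
The arithmetic behind that limit can be sketched as follows. This is a minimal illustration, not part of the original answer: it assumes the default AWS VPC CNI behaviour without prefix delegation, where pod capacity is ENIs x (IPv4 addresses per ENI - 1) + 2. The function name eni_max_pods and the LIMITS table are hypothetical, and the per-instance figures should be checked against the current EC2 documentation.

```python
def eni_max_pods(enis: int, ipv4_per_eni: int) -> int:
    """Estimate pod capacity for a node using the default VPC CNI formula.

    Assumption: no prefix delegation and no custom networking.
    """
    # The primary address of each ENI belongs to the interface itself,
    # so only (ipv4_per_eni - 1) secondary addresses per ENI go to pods.
    # The +2 covers pods that use host networking (for example aws-node
    # and kube-proxy) and so do not need a VPC address of their own.
    return enis * (ipv4_per_eni - 1) + 2


# Assumed (max ENIs, IPv4 addresses per ENI) values per instance type.
LIMITS = {
    "t3.micro": (2, 2),
    "t3.small": (3, 4),
    "t3.medium": (3, 6),
}

if __name__ == "__main__":
    for instance_type, (enis, ips) in LIMITS.items():
        print(f"{instance_type}: {eni_max_pods(enis, ips)} pods")
```

Under these assumptions a t3.micro has room for only 4 pods, which the system pods of a fresh cluster can already use up. To see what the scheduler actually counts against, compare the node's allocatable pod count (kubectl get node ip-10-0-12-61.ec2.internal -o jsonpath='{.status.allocatable.pods}') with the pods already running (kubectl get pods -A -o wide).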

Posted by huangapple on 2023-01-06 12:29:55
Source: https://go.coder-hub.com/75026935.html