k8s: no CPU and memory information for a pod

Question


When I use igztop to check running pods in an iguazio/mlrun deployment, the CPU and memory values are empty for one pod. See the first row of the output, for the pod whose name ends in m6vd9:

    [ jist @ iguazio-system 07:41:43 ]->(0) ~ $ igztop -s cpu
    +--------------------------------------------------------------+--------+------------+-----------+---------+-------------+-------------+
    | NAME | CPU(m) | MEMORY(Mi) | NODE | STATUS | MLRun Proj. | MLRun Owner |
    +--------------------------------------------------------------+--------+------------+-----------+---------+-------------+-------------+
    | xxxxxxxxxxxxxxxx7445dfc774-m6vd9 | | | k8s-node3 | Running | | |
    | xxxxxx-jupyter-55b565cc78-7bjfn | 27 | 480 | k8s-node1 | Running | | |
    | nuclio-xxxxxxxxxxxxxxxxxxxxxxxxxx-756fcb7f74-h6ttk | 15 | 246 | k8s-node3 | Running | | |
    | mlrun-db-7bc6bcf796-64nz7 | 13 | 717 | k8s-node2 | Running | | |
    | xxxx-jupyter-c4cccdbd8-slhlx | 10 | 79 | k8s-node1 | Running | | |
    | v3io-webapi-scj4h | 8 | 1817 | k8s-node2 | Running | | |
    | v3io-webapi-56g4d | 8 | 1827 | k8s-node1 | Running | | |
    | spark-worker-8d877878c-ts2t7 | 8 | 431 | k8s-node1 | Running | | |
    | provazio-controller-644f5784bf-htcdk | 8 | 34 | k8s-node1 | Running | | |
    +--------------------------------------------------------------+--------+------------+-----------+---------+-------------+-------------+

It was also not possible to see performance metrics (CPU, memory, I/O) for this pod in Grafana.

How can I resolve this issue without restarting the whole node, and what is the root cause?

Answer 1

Score: 1


The following troubleshooting steps should help you narrow down the issue:

1. Check the pod with the describe command. It does not report live CPU or memory usage, but it shows the configured requests and limits, the container state, and recent events:

    kubectl describe pods my-pod

2. Check whether CPU and memory usage is reported for all pods and nodes:

    kubectl top pod
    kubectl top node

3. Check whether the metrics server is registered and running:

    kubectl get apiservices v1beta1.metrics.k8s.io
    kubectl get pod -n kube-system -l k8s-app=metrics-server

4. Check the CPU and memory of the pod with the following PromQL queries (for example, in Prometheus or Grafana):

> CPU utilisation per pod:
>
> sum(irate(container_cpu_usage_seconds_total{container!="POD", container=~".+"}[2m])) by (pod)
>
> RAM usage per pod:
>
> sum(container_memory_usage_bytes{container!="POD", container=~".+"}) by (pod)

5. Check the logs of the pod and the node. If you find any errors, attach those logs for further troubleshooting.
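
Steps 2 and 3 can be combined into a small check that lists the running pods for which the metrics API returns nothing. This is a minimal sketch, assuming `kubectl` is configured for the cluster and metrics-server is installed. The function names `missing_from` and `pods_without_metrics` are hypothetical helpers, and the namespace argument is a placeholder.

```shell
# Print names that appear in the first newline-separated list
# but not in the second (exact, whole-line match).
missing_from() {
  [ -n "$1" ] || return 0
  printf '%s\n' "$1" | grep -vxF -e "$2"
}

# List Running pods in a namespace that `kubectl top pod` does not report.
# Usage: pods_without_metrics <namespace>   (the namespace is a placeholder)
pods_without_metrics() {
  ns="$1"
  # Names of all pods in phase Running, with the "pod/" prefix removed.
  running=$(kubectl get pods -n "$ns" --field-selector=status.phase=Running -o name | sed 's|^pod/||')
  # Names of the pods for which the metrics API returns usage data.
  with_metrics=$(kubectl top pod -n "$ns" --no-headers | awk '{print $1}')
  missing_from "$running" "$with_metrics"
}
```

After sourcing these functions, `pods_without_metrics <namespace>` prints one pod name per line. A pod that shows up here is Running but absent from the metrics API, as with the pod ending in m6vd9 in the question. If only one pod is affected, the problem is more likely on the node side than in metrics-server as a whole.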

Answer 2

Score: 0


This looks like an issue with the kubelet. The best approach is to follow the step-by-step scenario (see the diagram in the PDF).
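
One way to test the kubelet hypothesis without restarting anything is to ask the kubelet on the affected node whether its cAdvisor endpoint exposes any series for the pod. This is a sketch under assumptions. It relies on the API server's node proxy path `/api/v1/nodes/<node>/proxy/metrics/cadvisor`, needs permission to access `nodes/proxy`, and assumes the metrics carry a `pod` label. `count_pod_series` and `check_node_for_pod` are hypothetical helper names.

```shell
# Count metric lines on stdin that carry the label pod="<name>".
# Prints 0 (and still succeeds) when no line matches.
count_pod_series() {
  grep -c -F "pod=\"$1\"" || true
}

# Fetch cAdvisor metrics from the kubelet of <node> through the API server
# proxy and report how many series mention <pod>.
# Usage: check_node_for_pod <node> <pod>
check_node_for_pod() {
  kubectl get --raw "/api/v1/nodes/$1/proxy/metrics/cadvisor" | count_pod_series "$2"
}
```

For the pod in the question, this would be called with `k8s-node3` and the full pod name. If the count is 0 for a Running pod while other pods on the same node return series, that is consistent with a kubelet or container-runtime problem on that node. It does not identify the root cause by itself.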


Posted by huangapple on 2023-06-22 15:59:23
Link to this post: https://go.coder-hub.com/76529714.html