k8s: no information about CPU and memory

Question

I got empty values for CPU and memory when I used igztop to check the running pods in an iguazio/mlrun deployment. See the first row of the output, for the pod *m6vd9:

[ jist @ iguazio-system 07:41:43 ]->(0) ~ $ igztop -s cpu
+--------------------------------------------------------------+--------+------------+-----------+---------+-------------+-------------+
| NAME                                                         | CPU(m) | MEMORY(Mi) | NODE      | STATUS  | MLRun Proj. | MLRun Owner |
+--------------------------------------------------------------+--------+------------+-----------+---------+-------------+-------------+
| xxxxxxxxxxxxxxxx7445dfc774-m6vd9                             |        |            | k8s-node3 | Running |             |             |
| xxxxxx-jupyter-55b565cc78-7bjfn                              | 27     | 480        | k8s-node1 | Running |             |             |
| nuclio-xxxxxxxxxxxxxxxxxxxxxxxxxx-756fcb7f74-h6ttk           | 15     | 246        | k8s-node3 | Running |             |             |
| mlrun-db-7bc6bcf796-64nz7                                    | 13     | 717        | k8s-node2 | Running |             |             |
| xxxx-jupyter-c4cccdbd8-slhlx                                 | 10     | 79         | k8s-node1 | Running |             |             |
| v3io-webapi-scj4h                                            | 8      | 1817       | k8s-node2 | Running |             |             |
| v3io-webapi-56g4d                                            | 8      | 1827       | k8s-node1 | Running |             |             |
| spark-worker-8d877878c-ts2t7                                 | 8      | 431        | k8s-node1 | Running |             |             |
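| provazio-controller-644f5784bf-htcdk                         | 8      | 34         | k8s-node1 | Running |             |             |
+--------------------------------------------------------------+--------+------------+-----------+---------+-------------+-------------+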

It was also not possible to see performance metrics (CPU, memory, I/O) for this pod in Grafana.

Do you know how I can resolve this issue without restarting the whole node, and what the root cause is?

Answer 1

Score: 1

The following troubleshooting steps should help you narrow down the issue:

  1. Check what the describe command reports for the pod (it shows the configured CPU and memory requests and limits, along with recent events):

    kubectl describe pods my-pod

  2. Check whether CPU and memory usage is reported for all pods and nodes:

    kubectl top pod
    kubectl top node

  3. Check whether the metrics server is running:

    kubectl get apiservices v1beta1.metrics.k8s.io
    kubectl get pod -n kube-system -l k8s-app=metrics-server

  4. Check the pod's CPU and memory with the following Prometheus queries:

    CPU utilisation per pod:

    sum(irate(container_cpu_usage_seconds_total{container!="POD", container=~".+"}[2m])) by (pod)

    RAM usage per pod:

    sum(container_memory_usage_bytes{container!="POD", container=~".+"}) by (pod)

  5. Check the logs of the pod and the node. If you find any errors, attach those logs for further troubleshooting.
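Step 2 can be turned into a quick check of which running pods report no metrics at all. The following is only a rough sketch, not part of the original answer: the function names `missing_metrics` and `check_namespace` are hypothetical, and it assumes that a pod without metrics simply has no row in the `kubectl top pod` output.

```shell
#!/bin/sh
# Sketch: list Running pods that have no row in `kubectl top pod`.
# Function names are hypothetical and chosen for illustration only.

# missing_metrics RUNNING_FILE METRICS_FILE
# Both files hold one pod name per line. Prints every name from
# RUNNING_FILE that does not appear in METRICS_FILE.
missing_metrics() {
  # FILENAME == ARGV[1] (rather than NR == FNR) keeps this correct
  # even when the metrics file is empty.
  awk 'FILENAME == ARGV[1] { seen[$1] = 1; next }
       NF && !($1 in seen) { print $1 }' "$2" "$1"
}

# check_namespace NAMESPACE
# Collects both lists with kubectl and compares them.
# Requires kubectl access to the cluster and a working metrics server.
check_namespace() {
  ns="$1"
  tmp=$(mktemp -d) || return 1
  # Names of all pods currently in the Running phase.
  kubectl get pods -n "$ns" --field-selector=status.phase=Running \
    --no-headers -o custom-columns=NAME:.metadata.name > "$tmp/running"
  # Names of all pods for which the metrics API returns usage data.
  kubectl top pod -n "$ns" --no-headers | awk '{ print $1 }' > "$tmp/metrics"
  missing_metrics "$tmp/running" "$tmp/metrics"
  rm -rf "$tmp"
}
```

Calling `check_namespace my-namespace` (the namespace name is a placeholder) would print the pods that are running but absent from the metrics output, which is where the node-level checks from step 5 are worth focusing.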

Answer 2

Score: 0

This looks like an issue with the kubelet. The best approach is to follow the step-by-step scenario below (see the diagram in the PDF).

[Figure: troubleshooting diagram, two images]

huangapple
  • Posted on June 22, 2023, 15:59:23
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76529714.html