How to check Kafka health in my situation

Question

Our application is built on top of a K8S cluster and manages a large number of devices. One of its features is that if any device goes offline, a notification is triggered and an email is sent to the user.

But as of two weeks ago, this feature no longer works, and emails are not being sent.

Since the person who used to work on the Kafka part has left the company, I need to figure out for myself whether the Kafka service is still healthy.

I am quite new to Kafka. When I log in to the master node of the K8S cluster, I see the following Kafka-related pods:

k8s -n kafka get pods
NAME                              READY   STATUS    RESTARTS   AGE
kafka-0                           1/1     Running   779        466d
kafka-1                           1/1     Running   848        2y266d
kafka-2                           1/1     Running   797        466d
kafka-exporter-58c76747c6-j2cv2   1/1     Running   306        292d
kafka-manager-844657d4bf-q2j7n    1/1     Running   1647       2y215d
kafka-zookeeper-0                 1/1     Running   574        466d
kafka-zookeeper-1                 1/1     Running   573        2y266d
kafka-zookeeper-2                 1/1     Running   569        466d

Where should I start with any kind of sanity check on the Kafka service?

Answer 1

Score: 1

You have both kafka-exporter and kafka-manager services.

Both can be used to check the health of Kafka over HTTP, without talking to Kafka itself: query the exporter's metrics endpoint (or Prometheus / Grafana, if they also exist in your cluster), or open the kafka-manager UI. Neither check is specific to kubectl commands.
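
As a rough sketch of the exporter route: the function below summarizes a few health signals from the exporter's metrics output. It assumes the exporter is the widely used danielqsj/kafka_exporter (metric names kafka_brokers, kafka_topic_partition_under_replicated_partition, kafka_consumergroup_lag) listening on its default port 9308, and that it runs as a Deployment named kafka-exporter. All of that is inferred from the pod name in the question, not confirmed by it.

```shell
# Summarize a few health signals from kafka-exporter's Prometheus text output (read from stdin).
# Assumption: metric names follow danielqsj/kafka_exporter; adjust if your exporter differs.
summarize_exporter_metrics() {
  awk '
    /^#/ { next }                                  # skip HELP/TYPE comment lines
    $1 == "kafka_brokers" { brokers = $NF }        # number of brokers the exporter can see
    $1 ~ /^kafka_topic_partition_under_replicated_partition/ && $NF + 0 > 0 { urp++ }
    $1 ~ /^kafka_consumergroup_lag[{]/ { lag += $NF }   # sum lag over all groups and partitions
    END { printf "brokers=%d under_replicated=%d total_lag=%d\n", brokers, urp, lag }
  '
}

# Usage (needs cluster access; the resource name and port are assumptions):
#   kubectl -n kafka port-forward deploy/kafka-exporter 9308:9308 &
#   curl -s http://localhost:9308/metrics | summarize_exporter_metrics
```

With the three brokers listed in the question, a healthy cluster would be expected to report brokers=3 and under_replicated=0. A consumer-group lag that keeps growing would suggest that whatever consumes the offline-device events has stopped processing them.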

More information

Alternatively, you could migrate to a Strimzi installation (https://strimzi.io), which comes with its own Prometheus-based monitoring stack.

Answer 2

Score: 0

You can check pod logs with this command:

$ kubectl logs -n kafka [POD_NAME]

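You can inspect a pod's state, recent events, and configured resource requests and limits with this command (it does not show live usage; that would need kubectl top pod and a metrics server):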

$ kubectl describe pod [POD_NAME] -n kafka

These two commands can help you.
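
Given the very high restart counts in the question's pod listing, it may help to narrow down which pods to look at first. The helper below is only a sketch: it filters `kubectl get pods` output by the RESTARTS column, and the threshold is an arbitrary illustration rather than a recommended value.

```shell
# List pods whose RESTARTS column exceeds a threshold, reading `kubectl get pods` output from stdin.
# The default threshold of 100 is an arbitrary example value.
pods_with_many_restarts() {
  threshold="${1:-100}"
  # NR > 1 skips the header row; column 4 is RESTARTS in the default output format.
  awk -v t="$threshold" 'NR > 1 && $4 + 0 > t { print $1, $4 }'
}

# Usage (needs cluster access):
#   kubectl -n kafka get pods | pods_with_many_restarts 500
#   kubectl -n kafka logs kafka-0 --previous --tail=200   # logs of the container instance before its last restart
```

The --previous flag is useful here because the current container's logs only start at the most recent restart, while the cause of the crash is usually at the end of the previous instance's log.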

Answer 3

Score: 0

If the health check is done in the idiomatic k8s way, then a liveness probe is used.

Execute:

kubectl -n kafka get pod <pod name> -o yaml

and search for livenessProbe.
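
Instead of scanning the full YAML by eye, the probe can also be pulled out with a JSONPath query. This is a minimal sketch; the function name is made up for illustration, and the default namespace kafka is taken from the question.

```shell
# Print each container's name followed by its livenessProbe (blank after the colon if none is defined).
# Arguments: pod name, then namespace (defaults to "kafka", the namespace used in the question).
show_liveness_probes() {
  pod="$1"
  ns="${2:-kafka}"
  kubectl -n "$ns" get pod "$pod" \
    -o jsonpath='{range .spec.containers[*]}{.name}{": "}{.livenessProbe}{"\n"}{end}'
}

# Usage (needs cluster access):
#   show_liveness_probes kafka-0
```

If a probe is defined, its command or HTTP path shows what the chart's author considered "healthy", and a failing probe would be one possible explanation for the restart counts in the question.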

huangapple
  • Posted on March 31, 2023
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75891754.html