Prometheus Alert rule not showing up

Question

I created an alert rule file (file.yml), copied it under "/" in the Prometheus container, and added it to the values.yml file, but I can't see the rule in the UI.

    prometheus.yml:
      rule_files:
        - /etc/config/recording_rules.yml
        - /etc/config/alerting_rules.yml
        - /custom_alerting_rules.yml
        ## Below two files are DEPRECATED will be removed from this default values file
        - /etc/config/rules
        - /etc/config/alerts
      alerting:
        alertmanagers:
          - static_configs:
              - targets: ['alertmanager:9093']  # here I also tried the IP of the alertmanager service

Here is the alert file:

    groups:
      - name: my-custom-alerts
        rules:
          - alert: HighPodCount
            expr: count(kube_pod_info{pod=~"consumer.*"}) > 2
            for: 5m
            labels:
              severity: critical
            annotations:
              summary: High pod count
              description: The number of pods is above the threshold.

k get svc shows:

    prometheus-alertmanager   ClusterIP   10.10x.21x.x8   <none>   9093/TCP   68m

What am I doing wrong?

Answer 1

Score: 1

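The Prometheus server does not reload its configuration or rule files automatically. After changing them, you need to trigger a reload yourself by calling the reload management API. Note that the /-/reload endpoint only works when Prometheus is started with the --web.enable-lifecycle flag.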

There are three ways to do this:

  1. If you're in the Prometheus container:

         curl -XPOST http://localhost:9090/-/reload

  2. If you're in another pod in the same namespace:

         # prometheus pod name: prometheus-0
         # (assumes this name resolves from your pod; otherwise use the pod IP or the Prometheus service name)
         curl -XPOST http://prometheus-0:9090/-/reload

  3. If you're outside the K8s cluster (i.e. where you run kubectl commands):

         # prometheus pod name: prometheus-0, namespace: monitoring
         kubectl port-forward prometheus-0 -n monitoring 9090:9090
         # in another shell
         curl -XPOST http://127.0.0.1:9090/-/reload
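
To confirm that a reload actually picked up the rule, one option is to query the rules API (/api/v1/rules) and look for the alert name. The following is a minimal sketch, assuming the port-forward from option 3 is running. The helper names rule_loaded and check_rule and the PROM_URL variable are illustrative and not part of Prometheus.

```shell
#!/bin/sh
# Base URL of the Prometheus server (assumption: the port-forward from option 3).
PROM_URL="${PROM_URL:-http://127.0.0.1:9090}"

# rule_loaded NAME: reads a /api/v1/rules JSON response on stdin and
# succeeds if a rule (or rule group) with that name appears in it.
rule_loaded() {
    grep -Eq "\"name\"[[:space:]]*:[[:space:]]*\"$1\""
}

# check_rule NAME: fetch the currently loaded rules and report whether
# NAME is among them. Returns non-zero if it is missing or curl fails.
check_rule() {
    if curl -fsS "$PROM_URL/api/v1/rules" | rule_loaded "$1"; then
        echo "rule $1 is loaded"
    else
        echo "rule $1 is NOT loaded"
        return 1
    fi
}
```

For the rule from the question, this would be called as `check_rule HighPodCount`. If it reports that the rule is not loaded after a reload, the rule file is probably not at the path listed under rule_files, or it failed to parse.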

Answer 2

Score: 0

If you use the Prometheus Operator, you can alternatively define the rule as a PrometheusRule custom resource rather than a file:

    apiVersion: monitoring.coreos.com/v1
    kind: PrometheusRule
    metadata:
      name: my-custom-alert
      namespace: my-namespace
    spec:
      groups:
        - name: my-custom-alerts
          rules:
            - alert: HighPodCount
              expr: count(kube_pod_info{pod=~"consumer.*"}) > 2
              for: 5m
              labels:
                severity: critical
              annotations:
                summary: High pod count
                description: The number of pods is above the threshold.

Then:

    #> kubectl get PrometheusRule -n my-namespace
    my-custom-alert

The operator then loads the alert into Prometheus' configuration without you having to do anything else, provided the PrometheusRule is matched by the ruleSelector of your Prometheus resource.
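
If the rule still does not appear, it can help to compare the selector on the Prometheus resource with the labels on the rule. The following is a rough sketch, assuming the Prometheus resource lives in the monitoring namespace and the rule in my-namespace. The function name show_rule_selection and the PROM_NS and RULE_NS variables are illustrative.

```shell
#!/bin/sh
# Assumptions: the Prometheus resource lives in "monitoring" and the rule in
# "my-namespace"; adjust both to match your cluster.
PROM_NS="${PROM_NS:-monitoring}"
RULE_NS="${RULE_NS:-my-namespace}"

# show_rule_selection NAME: print the ruleSelector of the Prometheus
# resources, then the labels on the PrometheusRule NAME, so the two can be
# compared by eye.
show_rule_selection() {
    echo "ruleSelector of Prometheus resources in $PROM_NS:"
    kubectl get prometheus -n "$PROM_NS" \
        -o jsonpath='{.items[*].spec.ruleSelector}'
    echo
    echo "labels on PrometheusRule $1 in $RULE_NS:"
    kubectl get prometheusrule "$1" -n "$RULE_NS" --show-labels
}
```

Calling `show_rule_selection my-custom-alert` prints both pieces of information. If the selector requires labels that the rule does not carry, adding them under metadata.labels of the PrometheusRule is the usual fix.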

huangapple
  • Posted on July 10, 2023, 21:54:46
  • If you repost this article, please keep this link: https://go.coder-hub.com/76654464.html