Prometheus alert rule not showing up

Question

I created an alert rule file (file.yml), copied it to "/" in the Prometheus container, and added it to the values.yml file, but I can't see the rule in the UI.

prometheus.yml:

    rule_files:
      - /etc/config/recording_rules.yml
      - /etc/config/alerting_rules.yml
      - /custom_alerting_rules.yml
    ## The two files below are DEPRECATED and will be removed from this default values file
      - /etc/config/rules
      - /etc/config/alerts

    alerting:
      alertmanagers:
        - static_configs:
            - targets: ['alertmanager:9093']  # I also tried the IP address of the Alertmanager service here

Here is the alert file:

    groups:
      - name: my-custom-alerts
        rules:
          - alert: HighPodCount
            expr: count(kube_pod_info{pod=~"consumer.*"}) > 2
            for: 5m
            labels:
              severity: critical
            annotations:
              summary: High pod count
              description: The number of pods is above the threshold.

k get svc shows:

    prometheus-alertmanager   ClusterIP   10.10x.21x.x8   <none>   9093/TCP   68m

What am I doing wrong?

Answer 1

Score: 1

From the Prometheus server's perspective, its configuration and rule files are not reloaded automatically. You need to call the reload management API manually.

There are three ways to do this:

  1. If you're in the Prometheus container:

    curl -XPOST http://localhost:9090/-/reload

  2. If you're in another pod's container in the same namespace:

    # prometheus pod name: prometheus-0
    curl -XPOST http://prometheus-0:9090/-/reload

  3. If you're outside the K8s cluster (i.e. where you run kubectl commands):

    # prometheus pod name: prometheus-0, namespace: monitoring
    kubectl port-forward prometheus-0 -n monitoring 9090:9090
    # in another shell
    curl -XPOST http://127.0.0.1:9090/-/reload
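
Note that the reload endpoint only responds when Prometheus was started with the `--web.enable-lifecycle` flag. The steps above can be wrapped in a few small helpers that validate the rule file, trigger the reload, and then confirm that the rule was picked up. This is a minimal sketch, not a definitive script: the function names and the `PROM_URL` variable are hypothetical, it assumes `promtool` and `curl` are available where you run it, and the name check is a plain text match on the JSON returned by `/api/v1/rules`.

```shell
#!/bin/sh
# Base URL of the Prometheus server (assumption: the port-forward from option 3 is active).
PROM_URL="${PROM_URL:-http://127.0.0.1:9090}"

# Validate a rule file before loading it; promtool ships with Prometheus.
check_rules() {
  promtool check rules "$1"
}

# Ask Prometheus to re-read its configuration and rule files.
reload_prometheus() {
  curl -fsS -X POST "$PROM_URL/-/reload"
}

# Succeed if the rules API lists a rule (or a group) with the given name.
# This is a simple text match, not a full JSON parse.
rule_is_loaded() {
  curl -fsS "$PROM_URL/api/v1/rules" |
    grep -Eq "\"name\"[[:space:]]*:[[:space:]]*\"$1\""
}
```

With these defined, something like `check_rules /custom_alerting_rules.yml && reload_prometheus && rule_is_loaded HighPodCount` would report success only if each step passed. The rule file path is the one listed under `rule_files` in the question.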
    

Answer 2

Score: 0

If you are using the Prometheus Operator, you could use a PrometheusRule CRD instead of a file:

    apiVersion: monitoring.coreos.com/v1
    kind: PrometheusRule
    metadata:
      name: my-custom-alert
      namespace: my-namespace
    spec:
      groups:
      - name: my-custom-alerts
        rules:
          - alert: HighPodCount
            expr: count(kube_pod_info{pod=~"consumer.*"}) > 2
            for: 5m
            labels:
              severity: critical
            annotations:
              summary: High pod count
              description: The number of pods is above the threshold.

Then:

    $ kubectl get PrometheusRule -n my-namespace
    my-custom-alert

The operator then mounts the alert rule into the Prometheus configuration without you having to do anything else.
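
One caveat: the operator only picks up PrometheusRule objects that match the `ruleSelector` of the Prometheus custom resource, so a rule without the expected labels may be ignored. The sketch below shows one way to inspect that selector and label the rule accordingly. The function names are hypothetical, and the label you need depends entirely on your installation.

```shell
#!/bin/sh
# Print the name and ruleSelector of every Prometheus custom resource in a namespace.
# $1 = namespace
show_rule_selectors() {
  kubectl get prometheus -n "$1" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.ruleSelector}{"\n"}{end}'
}

# Add a label to a PrometheusRule so that the selector matches it.
# $1 = namespace, $2 = rule name, $3 = label as key=value
label_rule() {
  kubectl label prometheusrule "$2" -n "$1" "$3" --overwrite
}
```

For example, if `show_rule_selectors monitoring` printed a selector requiring `release: prometheus` (a made-up value for illustration), then `label_rule my-namespace my-custom-alert release=prometheus` would apply it to the rule from the answer.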

huangapple
  • Published on July 10, 2023 at 21:54:46
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76654464.html