Prometheus has no metrics from the TaskManager once the Flink job has started

Question

I run Flink 1.15.2 on Kubernetes with the following metrics configuration for the Flink cluster:

```yaml
# Metrics configuration
metrics.reporters: prom
metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
```

The problem is that Prometheus gets no metrics from the TaskManager once the Flink job has started. If I stop the job, the metrics show up, but some of them are empty.

1. I tried reducing CPU usage, but there were still no metrics from the TaskManager.
2. I tried increasing the number of task slots; still no metrics.
3. It happens on both Intel and ARM nodes.
4. I tried changing the Flink configuration as below. Metrics were collected for a moment (several seconds) and then disappeared again.

```yaml
# Metrics configuration
metrics.reporters: prom
metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporterFactory
```

5. I tried changing the Flink configuration as below, but there were still no metrics.

```java
kafkaSourceBuilder.setProperty("register.consumer.metrics", "false");
var producerProperties = new Properties();
producerProperties.setProperty("register.producer.metrics", "false");
producerSinkBuilder.setKafkaProducerConfig(producerProperties);
```

6. If I start the job on Flink 1.15.3, metrics are collected.
7. If I start the job on Flink 1.16.0, Prometheus has no metrics from Flink at all.


# Answer 1
**Score**: 2

As mentioned in the release notes of Flink 1.16, configuring reporters by their class has been deprecated. See https://nightlies.apache.org/flink/flink-docs-master/release-notes/flink-1.16/#flink-27206httpsissuesapacheorgjirabrowseflink-27206 for details.

There are also some known issues with metrics reporting in 1.16.0; please upgrade to Flink 1.16.1.
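
For reference, the factory-based form of the reporter configuration might look like the sketch below. It reuses the reporter name `prom` from the question. The port line is an assumption added for illustration (9249 is the Prometheus reporter's documented default), and the sketch assumes the Prometheus reporter jar is available to the Flink cluster.

```yaml
# Metrics configuration (sketch, not taken from the original post)
metrics.reporters: prom
# Configure the reporter through its factory instead of the deprecated `class` key
metrics.reporter.prom.factory.class: org.apache.flink.metrics.prometheus.PrometheusReporterFactory
# Assumption: explicit port shown for clarity; 9249 is the default
metrics.reporter.prom.port: 9249
```

Note that attempt 4 in the question put the factory class name under the `class` key rather than `factory.class`, so it is not the same as this configuration.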



huangapple
  • Published on February 8, 2023, 19:11:25
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75384934.html