Kafka Sink Connector lags always show 1 even after processing all records

Question

I am new to Kafka Connect.

I am using a Kafka Sink Connector to write data to a data lake. The connector writes the data to a staging location and then moves it to the target location.

However, once the connector has processed all the records in the topic and I check the lag in Kafka Manager, the lag always shows 1 when there is no new data; it never drops to 0.

When there is data in the topic again, the lag grows as expected and, after the records are processed, comes back down to 1.

Here is the config I am using:

"config": {
    "tasks.max": "1",
    "topics": "xxx",
    "adls.endpoint": "xxxx",
    "adls.container.name": "xxxx",
    "adls.auth.method": "ClientSecret",
    "adls.tenant.id": "xxxx",
    "adls.client.id": "xxxx",
    "adls.client.secret": "xxxxxxxxx",
    "base.directory": "xxxxxxxxxxxx",
    "rotation.filesize" : "1000000000",
    "rotation.inactivity" : "1800000",
    "rotation.record.count" : "100000",
    "auto.offset.reset":"earliest",
    "commit.rotated.only":false
  }

I would like to understand how to bring the lag down to zero, as an alert is triggered for me when the lag stays at 1 for a longer period of time.

I am not facing any data consistency issues, and I am receiving all the records. I would like to know what best practice to follow in this case: is it possible to bring the lag to zero, or should I modify my alerting?

I am not aware of the internal code implementation of this sink connector; I am using a JAR file.

I am using the configurations from here:

LINK OF CONNECTOR CONFIGURATIONS

Answer 1

Score: 1

It is hard to tell exactly based on the provided config, but it could be due to Kafka transactions. There is a known issue that, when Kafka transactions are used, the consumer lag never reaches 0, because the last message in the partition is a transaction commit marker that is not read by the consumer.

Consumer.position() Ignores Transaction Marker with read_uncommitted

And a related answer:

https://stackoverflow.com/questions/67966228/akhq-ui-shows-consumer-lag-always-1-for-jdbc-sink-connector-even-though-all-mess

I would suggest changing your alerting to trigger only when the consumer lag is greater than 1, to account for this.
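If you want to confirm that the remaining offset really is just a transaction marker rather than an unprocessed record, you can compute the lag by hand the same way most monitoring tools do: compare the connector group's committed offsets with the latest end offsets. Below is a minimal sketch using the Kafka Java AdminClient; the bootstrap server and the consumer group name "connect-my-sink" are placeholders, not values from the question (Kafka Connect sink connectors use "connect-<connector name>" as their group id by default).

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.ListOffsetsResult;
import org.apache.kafka.clients.admin.OffsetSpec;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

import java.util.Map;
import java.util.Properties;
import java.util.stream.Collectors;

public class SinkLagCheck {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Placeholder broker address.
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (Admin admin = Admin.create(props)) {
            // Offsets the sink connector's consumer group has committed.
            // "connect-my-sink" is a placeholder group id.
            Map<TopicPartition, OffsetAndMetadata> committed =
                    admin.listConsumerGroupOffsets("connect-my-sink")
                         .partitionsToOffsetAndMetadata()
                         .get();

            // Latest end offsets (high watermark) for the same partitions.
            Map<TopicPartition, OffsetSpec> request = committed.keySet().stream()
                    .collect(Collectors.toMap(tp -> tp, tp -> OffsetSpec.latest()));
            Map<TopicPartition, ListOffsetsResult.ListOffsetsResultInfo> endOffsets =
                    admin.listOffsets(request).all().get();

            for (TopicPartition tp : committed.keySet()) {
                long lag = endOffsets.get(tp).offset() - committed.get(tp).offset();
                // With a transactional producer upstream, a lag that stays at
                // exactly 1 while no new data arrives usually corresponds to
                // the transaction commit marker occupying the last offset.
                System.out.printf("%s lag=%d%n", tp, lag);
            }
        }
    }
}

If the gap stays at exactly 1 per partition while no new records arrive, alerting on lag greater than 1 (or on lag that keeps growing over time) as suggested above is the usual workaround.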
