
Few Kafka partitions are not getting assigned to any Flink consumer

Question


I have a Kafka topic with 15 partitions [0-14] and I'm running Flink with a parallelism of 5. So ideally each parallel Flink consumer should consume 3 partitions. But even after multiple restarts, a few of the Kafka partitions are not subscribed to by any Flink worker.

org.apache.kafka.clients.consumer.KafkaConsumer assign  Subscribed to partition(s): topic_name-13, topic_name-8, topic_name-9
org.apache.kafka.clients.consumer.KafkaConsumer assign  Subscribed to partition(s): topic_name-11, topic_name-12, topic_name-13
org.apache.kafka.clients.consumer.KafkaConsumer assign  Subscribed to partition(s): topic_name-14, topic_name-0, topic_name-10
org.apache.kafka.clients.consumer.KafkaConsumer assign  Subscribed to partition(s): topic_name-5, topic_name-6, topic_name-10
org.apache.kafka.clients.consumer.KafkaConsumer assign  Subscribed to partition(s): topic_name-2, topic_name-3, topic_name-7

The above logs show that partitions 10 and 13 have each been subscribed to by 2 consumers, while partitions 1 and 4 are not subscribed to at all.
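
For comparison, here is a minimal sketch in plain Java (my own illustration, not Flink's internal assignment code) of what a simple deterministic modulo assignment of these 15 partitions across 5 subtasks would look like: every subtask gets exactly 3 partitions and no partition appears twice.

// Illustration only (not Flink's actual partition-assignment code): a simple
// modulo scheme that spreads 15 partitions evenly over 5 consumer subtasks.
public class ExpectedAssignment {
    public static void main(String[] args) {
        int numPartitions = 15;
        int parallelism = 5;
        for (int subtask = 0; subtask < parallelism; subtask++) {
            StringBuilder partitions = new StringBuilder();
            for (int partition = 0; partition < numPartitions; partition++) {
                if (partition % parallelism == subtask) {
                    partitions.append("topic_name-").append(partition).append(' ');
                }
            }
            // each subtask prints exactly 3 distinct partitions
            System.out.println("subtask " + subtask + ": " + partitions.toString().trim());
        }
    }
}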

Note: If I start the job with 1 parallelism, the job works perfectly fine.

Flink version: 1.3.3

Answer 1

Score: 0


This sounds like https://issues.apache.org/jira/browse/FLINK-7143.

Reading through the details in the Jira ticket and in the pull request (https://github.com/apache/flink/pull/4301), it sounds like if you are on Flink 1.3.x you can only benefit from this bug fix if you do a fresh restart. Restarting from a savepoint isn't enough to benefit from the fix.
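
For reference, my reading of the ticket is that the fix makes the partition-to-subtask mapping a pure function of the topic name and partition id, so the assignment stays stable across restarts instead of depending on the order in which Kafka returns partition metadata. The sketch below paraphrases that idea (my own code, not the exact Flink source); deriving the start offset from the topic name hash is an assumption.

// Paraphrase of the deterministic assignment idea from FLINK-7143 / PR 4301,
// not the exact Flink source: the target subtask depends only on the topic
// name and the partition id, never on the order Kafka returns metadata in.
public class DeterministicAssignmentSketch {
    static int subtaskFor(String topic, int partition, int numSubtasks) {
        // start offset derived from the topic name, kept non-negative (assumption)
        int startIndex = (topic.hashCode() & 0x7FFFFFFF) % numSubtasks;
        // partitions of the same topic are then distributed round-robin
        return (startIndex + partition) % numSubtasks;
    }

    public static void main(String[] args) {
        // 15 partitions over 5 subtasks: each subtask gets exactly 3, none is skipped
        for (int p = 0; p < 15; p++) {
            System.out.println("topic_name-" + p + " -> subtask " + subtaskFor("topic_name", p, 5));
        }
    }
}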
