kafka __consumer_offsets disk usage is huge and asymmetric

Question

I'm wondering what could cause this and how to fix it.
kafka __consumer_offsets disk usage is huge (134 GB)
and asymmetric (mostly on broker 3, and mostly a single partition).
ReplicationFactor=3 and there are 3 brokers, so I would at least expect symmetry,
although I am more concerned about reducing the size.

MSK 2.8.1 and confluent 6.2.10 for the command-line.

$ kafka-log-dirs --describe --bootstrap-server $BOOTSTRAP --topic-list __consumer_offsets | grep '^{' | jq -r '.brokers[] | ["broker", .broker, "=", (([.logDirs[].partitions[].size] | add // 0) | . / 10000 | round | ./ 100), "MB" ] | @tsv' | paste -sd , | tr '\t' ' '
broker 1 = 459.72 MB,broker 2 = 218.95 MB,broker 3 = 134346.48 MB

$ kafka-log-dirs --describe --bootstrap-server $BOOTSTRAP --topic-list __consumer_offsets | grep '^{' | jq -r '.brokers[] | ["broker", .broker, "=", (.logDirs[].partitions[].size / 1000000 | round)] | @tsv' | tr '\t' ' '
broker 1 = 52 1 0 0 1 0 0 243 102 0 2 0 3 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 1 47 4 0 0 0 0 0 0 5 0 0 0 0 1 2 3 1 0 0 0
broker 2 = 52 1 0 0 1 0 0 2 102 0 2 0 3 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 1 47 4 0 0 0 0 0 0 5 0 0 0 0 1 2 3 1 0 0 0
broker 3 = 133907 1 0 0 8 3 1 31 10 4 2 0 27 0 2 0 14 8 4 4 1 0 3 2 0 10 0 0 3 14 35 123 0 0 2 0 0 0 23 0 0 0 0 25 26 39 9 3 6 5

$ kafka-topics --bootstrap-server $BOOTSTRAP --describe --topic __consumer_offsets
Topic: __consumer_offsets	TopicId: ...	PartitionCount: 50	ReplicationFactor: 3	Configs: compression.type=producer,min.insync.replicas=2,cleanup.policy=compact,segment.bytes=104857600,message.format.version=2.8-IV1,max.message.bytes=10485880,unclean.leader.election.enable=true
...

Answer 1

Score: 1

There's not much that can really be done about this at this point.

tl;dr You have consumer group names that hash to the same partition, or you have one really large consumer group that commits very frequently.
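
As a rough illustration of that hashing (a sketch, not part of the original answer): the broker picks a __consumer_offsets partition for a group with, roughly, abs(groupId.hashCode()) % partitionCount, where PartitionCount is 50 per the describe output above. The group names below are made up.

    import java.util.Arrays;

    public class OffsetsPartitionFor {
        // Roughly mirrors Kafka's group-to-partition mapping for __consumer_offsets:
        // abs(groupId.hashCode()) % partitionCount, guarding against Integer.MIN_VALUE.
        static int partitionFor(String groupId, int numPartitions) {
            int h = groupId.hashCode();
            int positive = (h == Integer.MIN_VALUE) ? 0 : Math.abs(h);
            return positive % numPartitions;
        }

        public static void main(String[] args) {
            int numPartitions = 50; // PartitionCount of __consumer_offsets above
            for (String groupId : Arrays.asList("orders-service", "billing-service", "analytics-v2")) {
                System.out.printf("group %-20s -> __consumer_offsets partition %d%n",
                        groupId, partitionFor(groupId, numPartitions));
            }
        }
    }

Running this over your real group names shows which of them land on the oversized partition.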

  1. The topic is compacted, so data stays around. Frequent consumer commits can arrive faster than compaction runs, causing that partition to grow quickly.
  2. You can consume that topic to inspect it (make sure you add --property print.key=true); you will notice that the keys are based on the group.id set in your consumer code bases.
  3. If you change group.id for your consumers (assuming you can track them down), then they will all lose any existing offsets for the data they've already consumed, unless you do some manual intervention to migrate the offsets using a combination of the seek and commitSync consumer APIs (see the sketch after this list); there is no built-in script to do this for you.
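
For point 3, a minimal sketch of that manual migration using the plain Java consumer API (committed / assign / seek / commitSync), assuming the old group's consumers are already stopped; the bootstrap address, topic, and group names below are placeholders:

    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.Map;
    import java.util.Properties;
    import java.util.Set;

    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.clients.consumer.OffsetAndMetadata;
    import org.apache.kafka.common.TopicPartition;
    import org.apache.kafka.common.serialization.ByteArrayDeserializer;

    public class MigrateOffsets {
        static Properties props(String bootstrap, String groupId) {
            Properties p = new Properties();
            p.put("bootstrap.servers", bootstrap);
            p.put("group.id", groupId);
            p.put("enable.auto.commit", "false");
            p.put("key.deserializer", ByteArrayDeserializer.class.getName());
            p.put("value.deserializer", ByteArrayDeserializer.class.getName());
            return p;
        }

        public static void main(String[] args) {
            String bootstrap = "localhost:9092"; // placeholder for $BOOTSTRAP
            String topic = "my-topic";           // placeholder topic

            // 1) Read the offsets the old group has committed.
            Set<TopicPartition> partitions = new HashSet<>();
            Map<TopicPartition, OffsetAndMetadata> committed;
            try (KafkaConsumer<byte[], byte[]> oldGroup =
                     new KafkaConsumer<>(props(bootstrap, "old-group"))) {
                oldGroup.partitionsFor(topic)
                        .forEach(pi -> partitions.add(new TopicPartition(topic, pi.partition())));
                committed = oldGroup.committed(partitions);
            }

            // 2) Re-commit them under the new group.id (seek, then commitSync).
            try (KafkaConsumer<byte[], byte[]> newGroup =
                     new KafkaConsumer<>(props(bootstrap, "new-group"))) {
                newGroup.assign(partitions);
                Map<TopicPartition, OffsetAndMetadata> toCommit = new HashMap<>();
                committed.forEach((tp, om) -> {
                    if (om != null) {                   // skip partitions with no committed offset
                        newGroup.seek(tp, om.offset()); // position the new group where the old one stopped
                        toCommit.put(tp, om);
                    }
                });
                newGroup.commitSync(toCommit);          // store those offsets for the new group
            }
        }
    }

After this, consumers started with the new group.id resume from where the old group left off.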

If you end up "moving" that partition, or segments of its log, it will start causing errors in consumers, since the broker will still try to use "partition X" to fetch/commit offsets.

In practice, people often have some consumer/topic onboarding form, associated with ACL policies and expected usage quotas, and that "Kafka onboarding process™" can enforce specific consumer group naming conventions.
