Custom compaction for Kafka topic on the broker side?

Question

Assume some Kafka cluster with some topic named MyTopic. According to the business logic I am implementing, adjacent records are considered equal whenever some subset of the value's properties, rather than the key's, are equal. Thus, built-in compaction, which is driven by key equality, doesn't work for my scenario. I could implement pseudo-compaction on the consumer side, but that is not an option due to performance. The whole idea is to maintain the right compaction on the broker side. In addition, such compaction has to be applied only within some special consumer group; all other groups have to get the entire log of records as they do now.

To my knowledge there is no way to implement such compaction. Am I wrong?
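
To make the equality notion concrete, here is a minimal sketch; the MyValue type and its field names are invented for illustration:

```java
// Hypothetical value type: the record key plays no role here; equality is
// defined by a subset of the value's properties.
final class MyValue {
    final String relevantA;   // part of the equality-defining subset
    final long relevantB;     // part of the equality-defining subset
    final String irrelevant;  // ignored when comparing adjacent records

    MyValue(String relevantA, long relevantB, String irrelevant) {
        this.relevantA = relevantA;
        this.relevantB = relevantB;
        this.irrelevant = irrelevant;
    }

    // Adjacent records whose values satisfy this predicate should be
    // collapsed into one by the desired compaction.
    boolean subsetEquals(MyValue other) {
        return relevantA.equals(other.relevantA) && relevantB == other.relevantB;
    }
}
```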

Answer 1

Score: 2


You cannot have custom log compaction. The cleanup policy is either delete or compact, and compaction is driven by keys. See https://kafka.apache.org/documentation/#compaction

However, if your case only concerns some special consumer group, you could create a stream that reads your topic, derives a hash key from the value subset, writes to another topic, and applies the compact cleanup policy to this new topic, as in the sketch below.

This obviously duplicates almost all of the data, which might not suit your case.
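
A minimal Kafka Streams sketch of that idea; the derived topic name MyTopic.compacted-view and the extractSubset() projection are placeholders, not something prescribed by the answer:

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class RekeyByValueSubset {

    // Placeholder: project out the value properties that define equality.
    static String extractSubset(String value) {
        return value; // replace with real field extraction
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "mytopic-rekey");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> source = builder.stream("MyTopic");

        // Re-key each record by a hash of the relevant value subset and write to
        // a derived topic; broker-side compaction on that topic then keeps only
        // the latest record per subset, while MyTopic itself stays untouched.
        source
            .selectKey((key, value) -> Integer.toHexString(extractSubset(value).hashCode()))
            .to("MyTopic.compacted-view");

        new KafkaStreams(builder.build(), props).start();
    }
}
```

The special consumer group then subscribes to the derived topic, which would be created with compaction enabled, for example: kafka-topics.sh --bootstrap-server localhost:9092 --create --topic MyTopic.compacted-view --config cleanup.policy=compact. Note that hashing the subset means collisions could conflate unrelated records; serializing the subset values themselves into the key avoids that.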

Answer 2

Score: 2


This question has already been answered correctly, i.e. it's not currently possible. But it's worth noting that KIP-280 has been approved and will add new compaction policies. It is currently targeted for Kafka 2.5.

It looks like your goal would be achieved with the new header policy.
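
For illustration, a producer could attach the compaction criterion as a record header; under the header strategy proposed in KIP-280, the log cleaner would compare header values across records with the same key instead of relying on offsets. The header name "version" and the final configuration surface are assumptions based on the proposal, not a released API:

```java
import java.nio.ByteBuffer;
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class HeaderCompactionProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            ProducerRecord<String, String> record =
                    new ProducerRecord<>("MyTopic", "some-key", "some-value");
            // Hypothetical header carrying the compaction criterion; under the
            // header strategy sketched in KIP-280, the cleaner would keep the
            // record with the winning header value for each key.
            record.headers().add("version", ByteBuffer.allocate(8).putLong(42L).array());
            producer.send(record);
        }
    }
}
```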

huangapple · Posted on 2020-01-06 23:14:08 · Source: https://go.coder-hub.com/59614537.html