Custom compaction for Kafka topic on the broker side?
Question
Assume a Kafka cluster with a topic named MyTopic. According to the business logic I am implementing, adjacent records are considered equal whenever some subset of the value's (rather than the key's) properties are equal. Thus, built-in compaction, which is driven by key equality, doesn't work for my scenario. I could implement pseudo-compaction on the consumer side, but that is not an option either, due to performance. The whole idea is to maintain proper compaction on the broker side. In addition, such compaction has to be applied only within some special consumer group; all other groups have to get the entire log of records as they are now.
According to my knowledge there is no way to implement such compaction. Am I wrong?
Answer 1
Score: 2
You cannot have custom log compaction. The cleanup policy is either delete or compact, based on keys: https://kafka.apache.org/documentation/#compaction
However, if your case only concerns some special consumer group, you could create a stream that reads your topic, derives a new key (e.g. a hash of the relevant value subset), and writes to another topic with the cleanup policy set to compact, as sketched below.
This will obviously hold nearly duplicated data, which might not suit your case.
Answer 2
Score: 2
This question has already been answered correctly, i.e. it is not currently possible. But it's worth noting that KIP-280 has been approved and will add new compaction policies. It is currently targeted for Kafka 2.5.
It looks like your goal would be achieved with the new header policy.
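If the header strategy from KIP-280 lands as proposed, compaction could pick the surviving record per key based on a configured header's value. Attaching such a header already works with the standard Producer API today; the header name `compaction-version` below is purely illustrative, since the broker-side configuration was not finalized at the time of writing:

```java
import java.nio.ByteBuffer;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.header.Header;
import org.apache.kafka.common.header.internals.RecordHeader;
import org.apache.kafka.common.serialization.StringSerializer;

public class HeaderProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Hypothetical header carrying a version number that a header-based
            // compaction strategy could compare to decide which record survives.
            long version = System.currentTimeMillis();
            List<Header> headers = List.of(new RecordHeader(
                "compaction-version",
                ByteBuffer.allocate(Long.BYTES).putLong(version).array()));
            producer.send(new ProducerRecord<>("MyTopic", null, "some-key", "some-value", headers));
        }
    }
}
```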