Kafka consumer to wait until all similar messages have arrived
Question
We have a requirement where the consumer must wait until all similar messages have arrived in the topic. For example, all invoices for the email "aaaaa@gmai.com" should arrive before the consumer starts consuming any invoice for aaaaa@gmai.com. Basically, all invoices should be grouped by email.
Answer 1
Score: 3
This isn't possible in Kafka itself. Neither the Kafka server nor the consumer knows when something is "complete"; only your own application logic does. Data within a topic is ordered only by the sequence of producer sends (by offset, per partition), not by email (or any other field embedded in the serialized payload) or by event type.
You'll need to aggregate the data explicitly upon consumption, for example in a KTable or a remote database, and then define what it means for events to be "similar" and when a batch is "complete" (which may never happen if the producer doesn't send an event that signals it).
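As a rough illustration of aggregating on the consumer side, here is a minimal in-memory sketch in plain Python. It deliberately uses no Kafka client: in a real consumer, `handle` would be called for each record returned by the poll loop. Everything in it is an assumption made for illustration, not part of Kafka: the `InvoiceGrouper` class, the `email` and `invoice_id` fields, the `END_OF_BATCH` marker that the producer would have to send, and the idle timeout used as a fallback when no marker arrives.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class InvoiceGrouper:
    """Buffers invoices per email until the application decides a group is complete.

    Hypothetical helper: Kafka has no notion of "complete", so this class
    encodes two application-level rules (an explicit marker, or idleness).
    """
    on_complete: Callable[[str, List[dict]], None]
    idle_timeout_s: float = 30.0
    _buffers: Dict[str, List[dict]] = field(default_factory=lambda: defaultdict(list))
    _last_seen: Dict[str, float] = field(default_factory=dict)

    def handle(self, record: dict, now: float) -> None:
        """Process one deserialized record; `now` is a timestamp in seconds."""
        email = record["email"]
        # Rule 1 (assumed convention): the producer sends an explicit marker.
        if record.get("type") == "END_OF_BATCH":
            self._flush(email)
            return
        self._buffers[email].append(record)
        self._last_seen[email] = now

    def flush_idle(self, now: float) -> None:
        """Rule 2: treat a group as complete if nothing arrived for a while."""
        idle = [e for e, t in self._last_seen.items() if now - t >= self.idle_timeout_s]
        for email in idle:
            self._flush(email)

    def _flush(self, email: str) -> None:
        invoices = self._buffers.pop(email, [])
        self._last_seen.pop(email, None)
        if invoices:
            self.on_complete(email, invoices)


# Usage: collect completed groups in a list instead of processing them.
completed = []
grouper = InvoiceGrouper(
    on_complete=lambda email, invoices: completed.append((email, invoices)),
    idle_timeout_s=30.0,
)
grouper.handle({"email": "aaaaa@gmai.com", "invoice_id": 1}, now=0.0)
grouper.handle({"email": "bbbbb@example.com", "invoice_id": 2}, now=1.0)
grouper.handle({"email": "aaaaa@gmai.com", "invoice_id": 3}, now=2.0)
# The marker closes the first group immediately.
grouper.handle({"email": "aaaaa@gmai.com", "type": "END_OF_BATCH"}, now=3.0)
# The second group never gets a marker, so it is closed by the idle timeout.
grouper.flush_idle(now=40.0)
```

Because the buffer lives only in memory, it is lost if the consumer restarts, which is one reason to keep the aggregate in a KTable or a remote database as suggested above.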