Kafka consumer behaviour with multiple topics

Question

For some reason, I have to use one consumer for two topics. Now the questions I have are:

  1. Why is there no way to specify which topic we want to poll?
  2. Will a single poll/consume call return messages from different topics?
  3. Assuming each topic has enough messages for multiple poll requests, will each poll call return messages from the different topics in round-robin fashion?
  4. If the traffic of the two topics is strongly unbalanced, will the topic with more messages get more attention in the poll requests?

Every Kafka resource seems to tell me that one consumer can subscribe to more than one topic, but surprisingly, I cannot find useful information that answers the questions above.

Answer 1

Score: 1

  1. A consumer is treated as a single subscription unit. For example, the configuration parameter max.poll.interval.ms defines the maximum time allowed between polls before the whole consumer is considered failed. You can manually dispatch each ConsumerRecord according to its topic() value. Alternatively, it is fine to create multiple consumers. You can even poll them from a single thread, as long as you do it deliberately.

  2. Yes, it will.

  3. Records are returned in a kind of "batched round robin" fashion. I've found that answer useful.

  4. As long as your consumer keeps up with the total rate of messages being produced, the question is moot: you will get all the messages, so no topic can receive extra attention.
    Otherwise, I suppose, but cannot guarantee, that you will get a roughly constant rate from the partitions you are not keeping up with, and a lower rate from the others. If you end up not keeping up with any partition, you will get a constant rate from all of them.
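To illustrate the manual dispatch mentioned in point 1, here is a minimal sketch that routes the records of one mixed batch to per-topic handlers. It is not Kafka client code: `Rec` is a hypothetical stand-in for `ConsumerRecord` so that the example runs without the client library, and `TopicDispatcher`, `register`, `dispatch`, and the topic names `orders` and `payments` are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Routes the records of one poll batch to per-topic handlers.
class TopicDispatcher {

    // Hypothetical stand-in for Kafka's ConsumerRecord (assumption),
    // so the sketch runs without the client library.
    static final class Rec {
        final String topic;
        final String value;

        Rec(String topic, String value) {
            this.topic = topic;
            this.value = value;
        }
    }

    private final Map<String, Consumer<Rec>> handlers = new HashMap<>();

    void register(String topic, Consumer<Rec> handler) {
        handlers.put(topic, handler);
    }

    // Passes each record to the handler for its topic, preserving batch order.
    // Returns the number of records whose topic has no registered handler.
    int dispatch(List<Rec> batch) {
        int unhandled = 0;
        for (Rec rec : batch) {
            Consumer<Rec> handler = handlers.get(rec.topic);
            if (handler == null) {
                unhandled++;
            } else {
                handler.accept(rec);
            }
        }
        return unhandled;
    }

    public static void main(String[] args) {
        TopicDispatcher dispatcher = new TopicDispatcher();
        List<String> orders = new ArrayList<>();
        dispatcher.register("orders", rec -> orders.add(rec.value));

        // One batch mixing two topics, as a single poll may return.
        List<Rec> batch = Arrays.asList(
                new Rec("orders", "o1"),
                new Rec("payments", "p1"),
                new Rec("orders", "o2"));

        int unhandled = dispatcher.dispatch(batch);
        System.out.println("orders=" + orders + ", unhandled=" + unhandled);
    }
}
```

With the real client, the same routing would key on `ConsumerRecord.topic()` inside the poll loop, or use `ConsumerRecords.records(String topic)` to pull one topic's records out of a batch.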

Hope this helps.

huangapple
  • Published on 2023-07-11 02:56:58
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76656575.html