Why partition is needed in Shopify Sarama consumer to consume messages
Question
I am sorry for posting a question related to a Kafka library, as not many people are interested in library-specific questions. But this library is one of the most widely used libraries for Go Kafka implementations.
I want to create a simple consumer using the Sarama library which listens to a topic. As far as I know, in the high-level Kafka APIs, a consumer by default listens to all of a topic's partitions if no specific partition is specified. However, in this library, the Consumer interface has only a ConsumePartition function, where the partition is a required parameter. The function's signature is:
ConsumePartition(topic string, partition int32, offset int64) (PartitionConsumer, error)
This confuses me a bit. Has anyone worked with it?
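For context, the low-level API in question can be used like this. This is a minimal sketch, assuming a local broker at localhost:9092 and a hypothetical topic name; it consumes only the single partition you name:

```go
package main

import (
	"fmt"
	"log"

	"github.com/Shopify/sarama"
)

func main() {
	// Assumed broker address, for illustration only.
	consumer, err := sarama.NewConsumer([]string{"localhost:9092"}, sarama.NewConfig())
	if err != nil {
		log.Fatal(err)
	}
	defer consumer.Close()

	// ConsumePartition binds this consumer to exactly one partition (here: 0),
	// starting from the oldest retained offset. Other partitions are ignored.
	pc, err := consumer.ConsumePartition("my-topic", 0, sarama.OffsetOldest)
	if err != nil {
		log.Fatal(err)
	}
	defer pc.Close()

	for msg := range pc.Messages() {
		fmt.Printf("partition=%d offset=%d value=%s\n", msg.Partition, msg.Offset, msg.Value)
	}
}
```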
Also, I have a basic question regarding Kafka. If I have a consumer group consisting of 3 consumer instances listening to, say, 2 topics with 2 partitions each, do I need to specify which consumer instance will consume which partition, or will the Kafka Fetch API take care of it on its own based on load?
Answer 1
Score: 3
I use sarama-cluster, which is an open-source extension of Sarama (also recommended by Shopify Sarama).
With Sarama cluster you can create a consumer using this API:
cluster.NewConsumer(brokers, consumerGroup, topics, kafkaConfig)
so no partition is needed. You only need to provide the addresses of your Kafka brokers, the name of your consumer group, and the topics you wish to consume.
To maintain order you should assign to each partition only one consumer.
So in case you have 3 consumers in your consumer group and you want them to consume 2 topics having 2 partitions each, you should assign as follows:
partitions 1,2 -> consumer A
partition 3 -> consumer B
partition 4 -> consumer C
You might end up with one of the consumers advancing faster than the others (one of the topics has higher throughput), and you will need to rebalance.
Using a library (like sarama-cluster) that handles this for you is recommended.
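A minimal sketch of the group-based consumer described above, assuming a local broker and hypothetical topic and group names (note that sarama-cluster has since been deprecated, as the next answer points out):

```go
package main

import (
	"log"

	cluster "github.com/bsm/sarama-cluster"
)

func main() {
	config := cluster.NewConfig()
	// Assumed broker address, group name, and topics, for illustration only.
	brokers := []string{"localhost:9092"}
	topics := []string{"topic-a", "topic-b"}

	// No partition argument: the group's rebalancer spreads the partitions
	// of both topics across every consumer that joins "my-group".
	consumer, err := cluster.NewConsumer(brokers, "my-group", topics, config)
	if err != nil {
		log.Fatal(err)
	}
	defer consumer.Close()

	for msg := range consumer.Messages() {
		log.Printf("topic=%s partition=%d offset=%d", msg.Topic, msg.Partition, msg.Offset)
		consumer.MarkOffset(msg, "") // mark the message as processed
	}
}
```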
Answer 2
Score: 0
Meanwhile, the sarama-cluster project has been deprecated; see the Deprecation Notice in their repo. The good news is that it was deprecated in favor of the PR Implement a higher-level consumer group #1099, which focuses on consuming topic(s) rather than dedicated partitions.
The examples/ folder in the official Sarama repo provides a good Consumer Group Implementation Example. It works like a charm and doesn't require any additional libraries on top of Sarama.
Preview:
config := sarama.NewConfig()
client, err := sarama.NewConsumerGroup(strings.Split(brokers, ","), "mygroup", config)
if err != nil {
    log.Fatal(err)
}
client.Consume(ctx, strings.Split(topics, ","), &consumer) // should run in a loop
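Fleshing out the preview above into a self-contained sketch (broker address, topic, and group name are assumed placeholders): Sarama calls ConsumeClaim once per partition it assigns to this group member, so no partitions are named anywhere.

```go
package main

import (
	"context"
	"log"
	"strings"

	"github.com/Shopify/sarama"
)

// handler implements sarama.ConsumerGroupHandler. Sarama invokes
// ConsumeClaim concurrently, once per assigned partition.
type handler struct{}

func (handler) Setup(sarama.ConsumerGroupSession) error   { return nil }
func (handler) Cleanup(sarama.ConsumerGroupSession) error { return nil }
func (handler) ConsumeClaim(sess sarama.ConsumerGroupSession, claim sarama.ConsumerGroupClaim) error {
	for msg := range claim.Messages() {
		log.Printf("topic=%s partition=%d offset=%d", msg.Topic, msg.Partition, msg.Offset)
		sess.MarkMessage(msg, "") // record progress so it is committed for the group
	}
	return nil
}

func main() {
	brokers, topics := "localhost:9092", "my-topic" // assumed values
	config := sarama.NewConfig()

	client, err := sarama.NewConsumerGroup(strings.Split(brokers, ","), "mygroup", config)
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	ctx := context.Background()
	for { // Consume returns after each rebalance, hence the loop
		if err := client.Consume(ctx, strings.Split(topics, ","), handler{}); err != nil {
			log.Fatal(err)
		}
	}
}
```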