How can I create a consumer group in a Spark Kafka stream and assign consumers to it?
Question
I have one topic named topic_1 with 4 partitions. I need to read from it in parallel in a Kafka Spark stream, so I need to create one consumer group with multiple consumers.
Can you please help me with how to do this?
Right now the Kafka Spark stream processes only one request from Kafka at a time.
Answer 1
Score: 1
Assuming you are using KafkaUtils from Spark, it will automatically take advantage of the number of Spark executors * cores per executor.
So, if you have 2 Spark executors with 2 cores each, Spark will automatically consume the 4 topic partitions in parallel.
In the Kafka Spark Streaming integration, the number of input tasks is determined by the number of partitions in the topic. If your topic has 4 partitions, Spark Streaming will spawn 4 tasks for each batch.
If you have 1 executor with 1 core, that core will execute the 4 tasks sequentially (no parallelism). Whereas if you have 2 executors with 1 core each, each core will execute 2 tasks sequentially (so the parallelism is 2).
With 4 partitions, you should configure any of the following to achieve maximum consumer parallelism:
- 1 executor with 4 cores
- 2 executors with 2 cores each
- 4 executors with 1 core each
Comments