High CPU load on Apache Karaf when using Kafka consumers in Camel route

Question

We use Apache Camel 3.11.7 in apache-karaf-4.4.0 to read data from a 3-node Apache-Kafka-2.13-3.3.1 cluster.
We have about 10000 topics and one consumer for every topic.
The Apache Karaf VM has 2 CPUs and 32 GiB of memory.

We use Apache Camel routes written in Java to consume the messages from the topics, and we deploy this code in Apache Karaf as bundles:

from ("kafka://crs.topic?brokers=PLAINTEXT://kafka1:9092,kafka2:9093,kafka3:9094&keyDeserializer=org.apache.kafka.common.serialization.StringDeserializer&valueDeserializer=com.rwe.remit.ejb.backend.kafkaclient.api.JacksonReadingSerializer&groupId=testing&heartbeatIntervalMs=120000&maxPollIntervalMs=86400000&sessionTimeoutMs=86400000&maxPollIntervalMs=86400000&deliveryTimeoutMs=86400000&requestTimeoutMs=86400000")

After starting all 900 bundles in Apache Karaf, the consumers listen on the topics for data.

Before we began testing and sending data to the topics, we found that the CPU load on the Apache Karaf VM was at 100%. It stays at that level all the time.

We have this configuration on the Kafka leader node:

process.roles=broker,controller
node.id=1
controller.quorum.voters=1@kafka1:19092,2@kafka2:19093,3@kafka3:19094
listeners=PLAINTEXT://kafka1:9092,CONTROLLER://kafka1:19092
inter.broker.listener.name=PLAINTEXT
advertised.listeners=PLAINTEXT://kafka1:9092
controller.listener.names=CONTROLLER
listener.security.protocol.map=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/data/apache-karaf/node1/log
num.partitions=3
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=3
transaction.state.log.replication.factor=3
transaction.state.log.min.isr=3
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=864000000
group.max.session.timeout.ms=86400000

We are now migrating from ActiveMQ to Kafka.
Previously we ran the same Apache Karaf and Camel setup with Apache ActiveMQ instead of Kafka to consume the messages from the queues, and we had no performance problems.

Do you have any idea why there is so much load on the Linux VM? Is there a way to show all internal processes and traffic in Java or in Apache Karaf?

Below is the thread dump for two Kafka consumers; the consumers for the other 10000 topics look the same.
The consumers are running all the time (100%). Why?
Is there a way to configure the consumers so that they run only when the topics have data and otherwise remain in standby mode?

"Camel (crs-rwe-tenants-test) thread #71 - KafkaConsumer[crs.topic1]" #394 daemon prio=5 os_prio=0 tid=0x00007f17b8374000 nid=0x1acf51 runnable [0x00007f1796253000]
   java.lang.Thread.State: RUNNABLE
	at java.lang.Integer.valueOf(Integer.java:832)
	at sun.nio.ch.EPollSelectorImpl.updateSelectedKeys(EPollSelectorImpl.java:120)
	at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:98)
	at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
	- locked <0x00000006d72f07c8> (a sun.nio.ch.Util$3)
	- locked <0x00000006d72f07b8> (a java.util.Collections$UnmodifiableSet)
	- locked <0x00000006d72cea40> (a sun.nio.ch.EPollSelectorImpl)
	at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
	at org.apache.kafka.common.network.Selector.select(Selector.java:873)
	at org.apache.kafka.common.network.Selector.poll(Selector.java:465)
	at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:560)
	at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:280)
	at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:251)
	at org.apache.kafka.clients.consumer.KafkaConsumer.pollForFetches(KafkaConsumer.java:1306)
	at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1242)
	at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1168)
	at org.apache.camel.component.kafka.KafkaConsumer$KafkaFetchRecords.doPollRun(KafkaConsumer.java:351)
	at org.apache.camel.component.kafka.KafkaConsumer$KafkaFetchRecords.doRun(KafkaConsumer.java:279)
	at org.apache.camel.component.kafka.KafkaConsumer$KafkaFetchRecords.run(KafkaConsumer.java:246)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)

"Camel (crs-rwe-tenants-allgaeukraft) thread #79 - KafkaConsumer[crs.topic2]" #413 daemon prio=5 os_prio=0 tid=0x00007f17b895b000 nid=0x1acf64 runnable [0x00007f1795142000]
   java.lang.Thread.State: RUNNABLE
	at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
	at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
	at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
	at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
	- locked <0x00000006d78c5610> (a sun.nio.ch.Util$3)
	- locked <0x00000006d78c5600> (a java.util.Collections$UnmodifiableSet)
	- locked <0x00000006d78a3710> (a sun.nio.ch.EPollSelectorImpl)
	at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
	at org.apache.kafka.common.network.Selector.select(Selector.java:873)
	at org.apache.kafka.common.network.Selector.poll(Selector.java:465)
	at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:560)
	at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:280)
	at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:251)
	at org.apache.kafka.clients.consumer.KafkaConsumer.pollForFetches(KafkaConsumer.java:1306)
	at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1242)
	at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1168)
	at org.apache.camel.component.kafka.KafkaConsumer$KafkaFetchRecords.doPollRun(KafkaConsumer.java:351)
	at org.apache.camel.component.kafka.KafkaConsumer$KafkaFetchRecords.doRun(KafkaConsumer.java:279)
	at org.apache.camel.component.kafka.KafkaConsumer$KafkaFetchRecords.run(KafkaConsumer.java:246)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)

Best regards, Amjad

Answer 1

Score: 2

You wrote:
"Before we began testing and sending data to the topics, we found that the CPU load on the Apache Karaf VM was at 100%. It stays at that level all the time."

I suggest taking a series of thread dumps, e.g. 5-6 dumps at 5-second intervals, and analysing them carefully. Comparing them should show which threads are taking up all the CPU and what they are doing.
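Thread dumps show stack traces but not how much CPU each thread consumed. As a complement, here is a minimal sketch that samples per-thread CPU time twice through the standard `java.lang.management.ThreadMXBean` and prints the threads with the largest increase. The class name `ThreadCpuSampler` and the method names are made up for illustration, and the code only sees the JVM it runs in, so it would have to be executed inside the Karaf process (for example from a small bundle) to say anything about the Camel consumers.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Camel, Karaf or Kafka.
public class ThreadCpuSampler {

    /** CPU time in nanoseconds per live thread id; threads without a reading are skipped. */
    static Map<Long, Long> snapshot(ThreadMXBean mx) {
        Map<Long, Long> cpu = new HashMap<>();
        for (long id : mx.getAllThreadIds()) {
            long nanos = mx.getThreadCpuTime(id); // -1 if the thread has died or CPU timing is disabled
            if (nanos >= 0) {
                cpu.put(id, nanos);
            }
        }
        return cpu;
    }

    /** Returns "thread name: N ms" lines for the threads that used the most CPU during the interval. */
    static List<String> topConsumers(long intervalMs, int limit) throws InterruptedException {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        if (!mx.isThreadCpuTimeSupported()) {
            throw new UnsupportedOperationException("thread CPU time is not supported by this JVM");
        }
        Map<Long, Long> before = snapshot(mx);
        Thread.sleep(intervalMs);
        Map<Long, Long> after = snapshot(mx);

        // Delta per thread; a thread started during the interval is charged its full CPU time.
        List<long[]> deltas = new ArrayList<>();
        for (Map.Entry<Long, Long> e : after.entrySet()) {
            long delta = e.getValue() - before.getOrDefault(e.getKey(), 0L);
            deltas.add(new long[] {e.getKey(), delta});
        }
        deltas.sort((a, b) -> Long.compare(b[1], a[1])); // largest CPU consumers first

        List<String> report = new ArrayList<>();
        for (long[] d : deltas) {
            if (report.size() >= limit) {
                break;
            }
            ThreadInfo info = mx.getThreadInfo(d[0]); // null if the thread has exited meanwhile
            if (info != null) {
                report.add(info.getThreadName() + ": " + (d[1] / 1_000_000) + " ms");
            }
        }
        return report;
    }

    public static void main(String[] args) throws InterruptedException {
        // Sample over 5 seconds, matching the interval suggested for the thread dumps.
        for (String line : topConsumers(5000, 10)) {
            System.out.println(line);
        }
    }
}
```

If the top entries are spread evenly over thousands of `KafkaConsumer[...]` threads, the load is probably the sum of many small idle polls rather than one runaway thread.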

If you have access to Red Hat resources, an even better way would be to follow KCS https://access.redhat.com/solutions/46596.

In general, with 10000 topic subscribers connected to Kafka, I would expect some background noise in terms of idle CPU usage, but probably not enough to use all of the available CPU.
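To get a feel for the size of that background noise, the following back-of-the-envelope sketch estimates how many empty fetch round trips idle consumers generate per second. It rests on assumptions that are not confirmed by the question: each consumer keeps one long-poll fetch open per broker that leads one of its partitions, an empty fetch returns after Kafka's `fetch.max.wait.ms` (default 500 ms), and the 3 partitions of each topic have leaders on 3 different brokers. The class and method names are invented for the example.

```java
// Hypothetical estimate; the inputs are assumptions, not measurements.
public class IdleFetchEstimate {

    /** Empty fetch round trips per second across all idle consumers. */
    static double idleFetchesPerSecond(int consumers, int brokersPerConsumer, long fetchMaxWaitMs) {
        // Each consumer re-issues a fetch to each leader broker as soon as the previous one times out.
        return consumers * brokersPerConsumer * (1000.0 / fetchMaxWaitMs);
    }

    public static void main(String[] args) {
        // 10000 consumers, leaders assumed on 3 brokers, default fetch.max.wait.ms of 500
        System.out.println(idleFetchesPerSecond(10_000, 3, 500)); // prints 60000.0
    }
}
```

Under these assumptions the client VM handles tens of thousands of empty responses per second on 2 CPUs, which would be consistent with the RUNNABLE selector frames in the thread dumps. Measuring the actual request rate would be needed to confirm it.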

huangapple
  • Posted on July 27, 2023, 22:05:41
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76780566.html