Logstash very low throughput with RabbitMQ
Question
I am using the following setup for my logging pipeline:

fluentbit -> logstash-frontend -> rmq -> logstash-backend -> opensearch
Now, the logstash-frontend is working fine and is able to queue messages into RabbitMQ. The problem is that I am getting very low throughput on logstash-backend, which causes the queue to pile up and eventually stalls the whole setup.
Here are my configurations:
logstash-frontend
output {
  rabbitmq {
    durable => true
    exchange => "logstash"
    exchange_type => "direct"
    persistent => true
    host => "opensearch-logging-cluster-rmq"
    user => "****"
    password => "****"
  }
}
logstash-backend
input {
  rabbitmq {
    ack => false
    durable => true
    exchange => "logstash"
    exchange_type => "direct"
    host => "opensearch-logging-cluster-rmq"
    user => "****"
    password => "****"
    threads => 4
  }
}
I have also set the following on the logstash-backend:

logstash.yaml

pipeline:
  batch:
    size: 2048

jvm.options

-Xms4g
-Xmx4g
11-13:-XX:+UseConcMarkSweepGC
11-13:-XX:CMSInitiatingOccupancyFraction=75
11-13:-XX:+UseCMSInitiatingOccupancyOnly
NOTE: I am running this whole setup in Google Kubernetes Engine
After starting the whole setup, I can see the exchange and queues as well as the connections, but the delivery rate is very slow - in the range of 300 messages/s.
Exchange: (screenshot)
Queue: (screenshot)
Also, I see that there are ~70 queues created. I am running 3 replicas each of logstash-frontend and logstash-backend.
Any idea what I am doing wrong here?
Answer 1
Score: 1
Firstly, I think there might be some misconfiguration of Logstash. In RabbitMQ you generally publish to an exchange and consume from a queue, so why does your logstash-backend specify an exchange and not a queue? I haven't used the Logstash RMQ input plugin, but I'm surprised this would even work, i.e. you don't consume from an exchange!
Also, I'm not sure how you're ending up with ~70 queues in RMQ, as you're not specifying a routing key in your logstash-frontend (using the key setting in the Logstash config), so I assume the routing key would default to logstash (based on the Logstash docs - see here) and there should only be 1 queue. It might be worth looking at the bindings (and binding keys) for your "logstash" exchange in RMQ to see what's going on...
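As a rough sketch of what I mean, naming the queue and routing key explicitly on both sides would look something like this (the queue name and routing key "logs" are placeholders I made up; if I read the plugin docs correctly, the input declares the named queue and binds it to the exchange for you, so all backend replicas then share one queue instead of each ending up with an auto-generated one):

# logstash-frontend: publish to the exchange with an explicit routing key
output {
  rabbitmq {
    exchange => "logstash"
    exchange_type => "direct"
    key => "logs"
    durable => true
    persistent => true
    host => "opensearch-logging-cluster-rmq"
    user => "****"
    password => "****"
  }
}

# logstash-backend: consume from a named queue bound to that exchange/key
input {
  rabbitmq {
    queue => "logs"
    exchange => "logstash"
    exchange_type => "direct"
    key => "logs"
    durable => true
    host => "opensearch-logging-cluster-rmq"
    user => "****"
    password => "****"
  }
}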
WRT performance, it's quite a complex topic and there are a number of things it could be. A good place to start would be this blog post on RMQ performance.
Here's a good list of RMQ performance optimizations... just to call a few of these out:
- Queues receiving more messages than your consumers can cope with could result in more CPU being used. You could try increasing the number of logstash-backend replicas...
- Queues getting so big that messages are written to disk to free up RAM. I can see in your diagram that some of the queues appear to have millions of messages, so this could be a possibility... ensure RMQ nodes have enough RAM and messages are getting consumed as quickly as they're being produced (and not just sitting there).
- Do you have a RMQ cluster or single instance and are you using durable storage? Clustering and message persistence can impact performance.
- Prefetch settings and acknowledgement batching. I can see you've already turned acknowledgements off, so this should already be optimized.
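Just for reference, in case you ever turn acknowledgements back on, both of those knobs live on the rabbitmq input itself; something along these lines (the numbers are illustrative only, and as far as I know the broker only applies the prefetch limit to consumers that acknowledge):

input {
  rabbitmq {
    queue => "logs"            # placeholder name from the sketch above
    ack => true                # prefetch_count has no effect with ack => false (auto-ack)
    prefetch_count => 512      # max un-acked messages held per consumer
    threads => 4
    host => "opensearch-logging-cluster-rmq"
    user => "****"
    password => "****"
  }
}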
Another thing not mentioned above: a queue is a single-threaded resource. If you design your routing topology so that messages are spread across multiple queues rather than hammering everything into a single queue, you can take advantage of additional CPU resources and minimize the CPU hit per message. For example, I'm not sure where all the logs are coming from, but you could specify different routing keys (in logstash-frontend) based on some criteria (e.g. source application, or some kind of timestamp algorithm) and configure multiple Logstash pipelines (in the logstash-backend) to consume from different queues.
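A rough sketch of what that could look like (the [service] field, queue names and routing keys are all made-up placeholders; if I remember rightly the output's key setting accepts %{field} interpolation):

# logstash-frontend: route by a hypothetical per-application field
output {
  rabbitmq {
    exchange => "logstash"
    exchange_type => "direct"
    key => "%{[service]}"      # e.g. "app-a" or "app-b"
    host => "opensearch-logging-cluster-rmq"
    user => "****"
    password => "****"
  }
}

# logstash-backend pipelines.yml: one pipeline per queue
- pipeline.id: app-a
  path.config: "/usr/share/logstash/pipeline/app-a.conf"
- pipeline.id: app-b
  path.config: "/usr/share/logstash/pipeline/app-b.conf"

where app-a.conf and app-b.conf each contain a rabbitmq input with their own queue and key (e.g. queue => "app-a", key => "app-a"). Each queue then gets its own consumers, so RabbitMQ can spread the per-queue work across cores.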
A couple of other misc suggestions:
- You could consolidate fluentbit and logstash-frontend to just use the OTEL (OpenTelemetry) agent with the RabbitMQ exporter.
- You could also explore Kafka instead of RMQ, but not sure this will be any easier to configure!
FYI, where I work we've implemented something similar with a queuing/streaming layer, but we use OTEL Collector agent -> AWS Kinesis -> Logstash -> Elasticsearch.