October 1, 2020, 23:48:29

RabbitMQ cluster node failure with spring boot application

Question

I have a Spring Boot application connected to a RabbitMQ cluster (as a service in Cloud Foundry). When the main node in the cluster fails and, for some reason, does not come back up, the application (a message consumer) keeps trying to connect to the failed node and does not try the other available nodes. Could someone suggest a Spring configuration that fixes this?

17:36:23.829: [APP/PROC/WEB.0] Caused by: com.rabbitmq.client.ShutdownSignalException: channel error; protocol method: #method<channel.close>(reply-code=404, reply-text=NOT_FOUND - home node 'rabbit@rad33f2b1-mq-1.node.dc1.svvc' of durable queue 'FAILED_ORDER' in vhost '/' is down or inaccessible, class-id=50, method-id=10)

'rabbit@rad33f2b1-mq-1.node.dc1.svvc' is the failed node.

To keep retrying the connection to the nodes after a failure, I have the following Spring configuration:
spring.rabbitmq.listener.simple.missing-queues-fatal=false

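import org.springframework.amqp.core.Binding;
import org.springframework.amqp.core.BindingBuilder;
import org.springframework.amqp.core.DirectExchange;
import org.springframework.amqp.core.Queue;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class MessageConfiguration {

    public static final String FAILED_ORDER_QUEUE_NAME = "FAILED_ORDER";

    public static final String EXCHANGE = "directExchange";

    // Durable, non-exclusive, non-auto-delete queue (the defaults of this constructor).
    @Bean
    public Queue failedOrderQueue() {
        return new Queue(FAILED_ORDER_QUEUE_NAME);
    }

    // Durable, non-auto-delete direct exchange.
    @Bean
    public DirectExchange directExchange() {
        return new DirectExchange(EXCHANGE, true, false);
    }

    // Routes messages whose routing key equals the queue name to the queue.
    @Bean
    public Binding secondBinding(Queue failedOrderQueue, DirectExchange directExchange) {
        return BindingBuilder.bind(failedOrderQueue).to(directExchange).with(FAILED_ORDER_QUEUE_NAME);
    }
}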

Answer 1

Score: 2

This can happen when you are using a non-HA auto-delete queue with an incorrect master locator.

If the master locator is not client-local, the auto-delete queue might be created on a node other than the one the client is connected to. In that case, if the node hosting the queue goes down, you will get this problem.

To avoid this problem with auto-delete queues, set the x-queue-master-locator queue argument to client-local or set a policy on the broker to do the same for queues matching this name.
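As a rough illustration of the queue-argument option, an auto-delete queue could be declared in Spring AMQP with the argument passed through the five-argument Queue constructor. This is only a sketch: the class name, bean name, and queue name TEMP_REPLIES are hypothetical, and Map.of assumes Java 9 or later.

```java
import java.util.Map;

import org.springframework.amqp.core.Queue;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class AutoDeleteQueueConfiguration {

    // Hypothetical queue name, used only for illustration.
    public static final String TEMP_QUEUE_NAME = "TEMP_REPLIES";

    @Bean
    public Queue tempRepliesQueue() {
        // Ask the broker to host the queue on the node this client is connected to.
        Map<String, Object> arguments = Map.of("x-queue-master-locator", "client-local");
        // Arguments: name, durable=false, exclusive=false, autoDelete=true, arguments
        return new Queue(TEMP_QUEUE_NAME, false, false, true, arguments);
    }
}
```

Queue arguments are fixed when the queue is first declared, so an existing queue with different arguments would have to be deleted and redeclared for this to take effect.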

However, you are not using an auto-delete queue...

@Bean
public Queue failedOrderQueue(){
    return new Queue(FAILED_ORDER_QUEUE_NAME);
}

In a cluster, a non-HA queue is not replicated, so if the node that owns it goes down, you will get this error until that node comes back up.

To avoid this problem, set a policy to make the queue a mirrored (HA) queue.

https://www.rabbitmq.com/ha.html
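
One possible way to set such a policy from Java is through the management plugin's HTTP API, which accepts a PUT to /api/policies/{vhost}/{name}. The sketch below makes several assumptions: the management plugin is enabled on localhost:15672, the default guest/guest credentials are valid, the broker version still supports classic queue mirroring, and the policy name ha-failed-order is made up for illustration. It needs Java 11 or later for java.net.http.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class HaPolicyExample {

    // Builds a PUT request that mirrors the FAILED_ORDER queue across all nodes.
    static HttpRequest buildPolicyRequest(String baseUrl, String user, String password) {
        String body = "{\"pattern\":\"^FAILED_ORDER$\","
                + "\"definition\":{\"ha-mode\":\"all\"},"
                + "\"apply-to\":\"queues\"}";
        String credentials = Base64.getEncoder()
                .encodeToString((user + ":" + password).getBytes(StandardCharsets.UTF_8));
        return HttpRequest.newBuilder()
                // %2F is the URL-encoded default vhost "/"; ha-failed-order is a hypothetical policy name.
                .uri(URI.create(baseUrl + "/api/policies/%2F/ha-failed-order"))
                .header("Content-Type", "application/json")
                .header("Authorization", "Basic " + credentials)
                .PUT(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) throws Exception {
        // Assumed management endpoint and credentials; adjust for the real cluster.
        HttpRequest request = buildPolicyRequest("http://localhost:15672", "guest", "guest");
        System.out.println(request.method() + " " + request.uri());
        // Only contact the broker when explicitly asked to.
        if (args.length > 0 && args[0].equals("--send")) {
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println("HTTP " + response.statusCode());
        }
    }
}
```

Because the policy is applied on the broker, the application's queue declaration does not need to change.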
