RabbitMQ cluster node failure with Spring Boot application
Question
I have a Spring Boot application connected to a RabbitMQ cluster (provisioned as a service in Cloud Foundry). When the main node of the cluster fails and, for some reason, does not come back up, the application (a message consumer) keeps trying to connect to the failed node and does not try the other available nodes. Could someone suggest Spring configuration to fix this?
17:36:23.829: [APP/PROC/WEB.0] Caused by: com.rabbitmq.client.ShutdownSignalException: channel error; protocol method: #method<channel.close>(reply-code=404, reply-text=NOT_FOUND - home node 'rabbit@rad33f2b1-mq-1.node.dc1.svvc' of durable queue 'FAILED_ORDER' in vhost '/' is down or inaccessible, class-id=50, method-id=10)
'rabbit@rad33f2b1-mq-1.node.dc1.svvc' is the failed node.
To keep retrying the connection when a node fails, I have the following Spring configuration:
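    spring.rabbitmq.listener.simple.missing-queues-fatal=false

    import org.springframework.amqp.core.Binding;
    import org.springframework.amqp.core.BindingBuilder;
    import org.springframework.amqp.core.DirectExchange;
    import org.springframework.amqp.core.Queue;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;

    @Configuration
    public class MessageConfiguration {
        public static final String FAILED_ORDER_QUEUE_NAME = "FAILED_ORDER";
        public static final String EXCHANGE = "directExchange";

        // Durable, non-exclusive, non-auto-delete queue (the defaults of this constructor).
        @Bean
        public Queue failedOrderQueue() {
            return new Queue(FAILED_ORDER_QUEUE_NAME);
        }

        // Durable direct exchange that is not auto-deleted.
        @Bean
        public DirectExchange directExchange() {
            return new DirectExchange(EXCHANGE, true, false);
        }

        // Routes messages whose routing key equals the queue name to the queue.
        @Bean
        public Binding secondBinding(Queue failedOrderQueue, DirectExchange directExchange) {
            return BindingBuilder.bind(failedOrderQueue).to(directExchange).with(FAILED_ORDER_QUEUE_NAME);
        }
    }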
Answer 1
Score: 2
RabbitMQ documentation on mirrored (HA) queues: https://www.rabbitmq.com/ha.html
This can happen when you use a non-HA auto-delete queue with an incorrect master locator.
If the master locator is not client-local, the auto-delete queue might be created on a different node from the one the client is connected to. In that case, if the node hosting the queue goes down, you will get this error.
To avoid this with auto-delete queues, set the x-queue-master-locator queue argument to client-local, or set a policy on the broker that does the same for queues matching this name.
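As a rough illustration of the queue-argument option, the sketch below builds the argument map that would accompany the queue declaration. It uses only the JDK; the class and method names (`MasterLocatorArgs`, `clientLocalArgs`) are made up for this example and are not part of any library.

```java
import java.util.HashMap;
import java.util.Map;

class MasterLocatorArgs {

    // Queue argument named in the answer above.
    public static final String MASTER_LOCATOR_ARG = "x-queue-master-locator";

    /**
     * Builds declaration arguments asking the broker to host the queue
     * on the node the declaring client is connected to.
     */
    public static Map<String, Object> clientLocalArgs() {
        Map<String, Object> arguments = new HashMap<>();
        arguments.put(MASTER_LOCATOR_ARG, "client-local");
        return arguments;
    }

    public static void main(String[] args) {
        // Print the arguments so they can be checked before being used in a declaration.
        System.out.println(clientLocalArgs());
    }
}
```

With Spring AMQP, a map like this can be passed as the last constructor argument of `Queue`, for example `new Queue("someAutoDeleteQueue", false, false, true, MasterLocatorArgs.clientLocalArgs())`, where the queue name is a placeholder.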
However, you are not using an auto-delete queue:

    @Bean
    public Queue failedOrderQueue() {
        return new Queue(FAILED_ORDER_QUEUE_NAME);
    }

In a cluster, a non-HA queue is not replicated, so if the node that owns it goes down, you will get this error until that node comes back up.
To avoid this, set a policy on the broker that makes the queue a mirrored (HA) queue.
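Policies are usually set with `rabbitmqctl` or the management UI. Purely as a sketch, the same policy could also be sent to the broker's management HTTP API from Java. This assumes the management plugin is enabled and that the broker version still supports classic queue mirroring. The host, port, credentials, policy name, and the choice of `ha-mode: all` are assumptions made for illustration, as is the class name `HaPolicySketch`.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class HaPolicySketch {

    /** URL of the policy resource; the default vhost "/" has to be sent as %2F. */
    public static String policyUrl(String managementBaseUrl, String vhost, String policyName) {
        String encodedVhost = URLEncoder.encode(vhost, StandardCharsets.UTF_8);
        return managementBaseUrl + "/api/policies/" + encodedVhost + "/" + policyName;
    }

    /** JSON body that mirrors every queue whose name matches the pattern to all nodes. */
    public static String policyBody(String queueNamePattern) {
        return "{\"pattern\":\"" + queueNamePattern + "\","
                + "\"definition\":{\"ha-mode\":\"all\"},"
                + "\"apply-to\":\"queues\"}";
    }

    /** Sends the policy with HTTP basic auth and returns the HTTP status code. */
    public static int putPolicy(String url, String body, String user, String password)
            throws Exception {
        String credentials = Base64.getEncoder()
                .encodeToString((user + ":" + password).getBytes(StandardCharsets.UTF_8));
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .header("Content-Type", "application/json")
                .header("Authorization", "Basic " + credentials)
                .PUT(HttpRequest.BodyPublishers.ofString(body))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode();
    }

    public static void main(String[] args) {
        // Hypothetical endpoint and policy name; nothing is sent here, the request is only printed.
        System.out.println(policyUrl("http://localhost:15672", "/", "ha-failed-order"));
        System.out.println(policyBody("^FAILED_ORDER$"));
    }
}
```

Calling `putPolicy` with the printed URL and body, plus credentials of a user allowed to manage policies, would apply the policy. The pattern `^FAILED_ORDER$` limits it to the queue from the question.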