How to Reconnect to all IBM Queue Managers in a cluster

Question


We use IBM MQ for integration between various microservices. The applications are quite critical and we aim for zero downtime. We have a cluster of three queue managers, each running on a different server (in separate AWS availability zones): say QM1 on server1, QM2 on server2 and QM3 on server3.

We configure three connection factories as shown below (note the differences in the connection name lists):

// Each factory lists the same three hosts, starting with its preferred one.
var connectionFactory1 = new MQQueueConnectionFactory();
connectionFactory1.setConnectionNameList("server1,server2,server3");
connectionFactory1.setPort(1414);
....
var connectionFactory2 = new MQQueueConnectionFactory();
connectionFactory2.setConnectionNameList("server2,server3,server1");
connectionFactory2.setPort(1414);
....
var connectionFactory3 = new MQQueueConnectionFactory();
connectionFactory3.setConnectionNameList("server3,server1,server2");
connectionFactory3.setPort(1414);
....

The idea behind this setup is to use all three queue managers at the same time. The first connection factory consumes from QM1, the second from QM2, and the third from QM3.

From time to time the servers running the IBM queue managers need to be patched and restarted. When this happens, the queue manager on that server obviously has to be shut down. While a queue manager is down, all traffic is redirected through the other two queue managers in the cluster, so the flow of messages never stops.

While server1 is down:

  • connectionFactory1 effectively uses server2,server3 and so consumes from QM2
  • connectionFactory2 effectively uses server2,server3 and so consumes from QM2
  • connectionFactory3 effectively uses server3,server2 and so consumes from QM3

After patching server1 we start QM1.

The issue is that the three connection factories stay switched as above and never reconnect to QM1. The only way we were able to restore the desired state was to restart the application, which is not really a good or acceptable solution.
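The behaviour described above can be illustrated with a toy model. This is not IBM MQ code; the class and method names are made up for illustration, and it assumes only that the connection name list is walked in order when a connection has to be made and that a healthy connection is left alone.

```java
import java.util.List;
import java.util.Set;

// Toy model (hypothetical names, not the IBM MQ client): a client connects to
// the first reachable host in its connection name list.
final class NameListFailover {
    // Returns the first host in the list that is currently up, or null if none is.
    static String firstReachable(List<String> nameList, Set<String> up) {
        for (String host : nameList) {
            if (up.contains(host)) {
                return host;
            }
        }
        return null;
    }

    // A live connection is kept while its host is up; the list is consulted
    // again only when the current connection has failed.
    static String afterEvent(String current, List<String> nameList, Set<String> up) {
        if (current != null && up.contains(current)) {
            return current;
        }
        return firstReachable(nameList, up);
    }

    public static void main(String[] args) {
        List<String> factory1 = List.of("server1", "server2", "server3");
        Set<String> all = Set.of("server1", "server2", "server3");
        Set<String> server1Down = Set.of("server2", "server3");

        String conn = firstReachable(factory1, all);    // connects to server1
        conn = afterEvent(conn, factory1, server1Down); // fails over to server2
        conn = afterEvent(conn, factory1, all);         // server1 is back, nothing failed
        System.out.println(conn);                       // prints server2
    }
}
```

In this model nothing ever prompts the client to look at the list again once server1 returns, which matches what we observe.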

In our client code we implemented some resiliency patterns to detect when QM1 comes back up and then reset connectionFactory1 (a Spring CachingConnectionFactory wrapped around the MQQueueConnectionFactory), as well as stopping and starting all listener containers that consume with QM1 as their preferred queue manager, but this had no effect. The only way we could do it was to restart the Spring application context, which is much the same as restarting the application. When you have many such applications, this is really not a good solution.

I noticed that MQQueueConnectionFactory has a method setClientReconnectOptions(int options) throws javax.jms.JMSException, but reading the comment on that method did not make it clear to me whether it can be used for what we want.

Thank you in advance for your inputs.

Answer 1

Score: 1


Reconnect options are for re-making the connection after a failure. They will not cause a connection to be re-made just because the set of connections is unbalanced.

For more about reconnectable clients, see Reconnectable clients in the IBM Docs.

This is not an easy problem to solve from inside the application because the application does not know about the whole environment. That is why IBM MQ now has a feature called Uniform Clusters with Application Rebalancing which does EXACTLY this. When you start up the recycled queue manager, the cluster (which has a picture of the whole set of client-connected applications) notices the imbalance and tells some of the applications to go elsewhere. It utilises the client application's ability to reconnect as per the above options, but the trigger to move to another queue manager comes from the queue manager the application is currently connected to, rather than being determined by the client.
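To make the idea concrete, here is a minimal sketch of the kind of decision involved. It is a toy model with hypothetical names, not IBM MQ's actual balancing algorithm; it assumes only that something with a view of all instances aims for an even spread and that each over-target queue manager asks its surplus instances to reconnect elsewhere.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model (hypothetical, not the real uniform cluster logic): work out how
// many application instances each queue manager would ask to move.
final class RebalanceSketch {
    // Even target per queue manager: total / n each, with any remainder
    // given to the first queue managers in iteration order.
    static Map<String, Integer> targets(Map<String, Integer> current) {
        int total = current.values().stream().mapToInt(Integer::intValue).sum();
        int n = current.size();
        Map<String, Integer> result = new LinkedHashMap<>();
        int i = 0;
        for (String qm : current.keySet()) {
            result.put(qm, total / n + (i < total % n ? 1 : 0));
            i++;
        }
        return result;
    }

    // Surplus instances per queue manager; zero where it is at or below target.
    static Map<String, Integer> toMove(Map<String, Integer> current) {
        Map<String, Integer> target = targets(current);
        Map<String, Integer> moves = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : current.entrySet()) {
            moves.put(e.getKey(), Math.max(0, e.getValue() - target.get(e.getKey())));
        }
        return moves;
    }

    public static void main(String[] args) {
        Map<String, Integer> current = new LinkedHashMap<>();
        current.put("QM1", 0); // just restarted, no connections yet
        current.put("QM2", 2);
        current.put("QM3", 1);
        System.out.println(toMove(current)); // prints {QM1=0, QM2=1, QM3=0}
    }
}
```

With the numbers from the question, QM2 holds one connection too many after QM1 restarts, so it is QM2 that would prompt one instance to move; the client only has to be reconnectable.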

The Uniform Clusters feature and Application Balancing were added in IBM MQ V9.1.2 and enhanced in several of the subsequent CD releases. The first LTS release to provide it would therefore be IBM MQ V9.2.0.

For more about Uniform Clusters, see About uniform clusters in the IBM Docs.

huangapple
  • Posted on March 1, 2023 at 13:29:34
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75599900.html