2023-08-10 22:11:20

Possibility of Split brain in ActiveMQ Artemis HA shared storage?

Question

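What are the possibilities of split-brain in an Artemis HA shared-storage deployment? ActiveMQ Artemis 2.17.0 is deployed as an HA active/passive (master/slave) pair with shared storage on AWS EFS. Are there specific log statements to check in artemis.log?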

Master cluster configuration

    <connectors>
       <connector name="artemis">tcp://<master_ip>:61616</connector>
       <connector name="discovery-connector">tcp://<slave_ip>:61616</connector>
    </connectors>
    
    <cluster-connections>
       <cluster-connection name="artemis_cluster_configuration">
          <connector-ref>artemis</connector-ref>
          <message-load-balancing>ON_DEMAND</message-load-balancing>
    			
          <max-hops>1</max-hops>
          <static-connectors>   
             <connector-ref>discovery-connector</connector-ref> 
          </static-connectors> 
       </cluster-connection>
    </cluster-connections>
    
    
    <ha-policy>
       <shared-store>
          <master>
             <failover-on-shutdown>true</failover-on-shutdown>
          </master>
       </shared-store>
    </ha-policy>

Slave cluster configuration

    <connectors>
       <connector name="artemis">tcp://<slave_ip>:61616</connector>
       <connector name="discovery-connector">tcp://<master_ip>:61616</connector>
    </connectors>
    
    <cluster-connections>
       <cluster-connection name="artemis_cluster_configuration">
          <connector-ref>artemis</connector-ref>
          <message-load-balancing>ON_DEMAND</message-load-balancing>
          <max-hops>1</max-hops>
          <static-connectors>   
             <connector-ref>discovery-connector</connector-ref> 
          </static-connectors> 
       </cluster-connection>
    </cluster-connections>
    
    
    <ha-policy>
       <shared-store>
          <slave>
             <failover-on-shutdown>true</failover-on-shutdown>
             <allow-failback>true</allow-failback>
          </slave>
       </shared-store>
    </ha-policy>

Answer 1

Score: 0

Generally speaking, shared storage is resilient against split-brain. I believe the only split-brain issue related to shared storage that has been fixed since 2.17.0 is ARTEMIS-4143, which deals with the primary broker becoming disconnected from the shared storage and then reconnecting after the backup has already become active.

If you are using a discovery-group in your broker.xml and you encounter split-brain, you'll likely see a WARN log message with the code AMQ212034 that says:

There are more than one servers on the network broadcasting the same node id. You will see this message exactly once (per node) if a node is restarted, in which case it can be safely ignored. But if it is logged continuously it means you really do have more than one node on the same network active concurrently with the same node id. This could occur if you have a backup node active at the same time as its live node. nodeID={}
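
The message itself says that a single occurrence per node after a restart is harmless and that continuous logging is the real signal, so it can help to count occurrences per node id rather than just grep for the code. Below is a minimal sketch of that idea. It assumes each warning sits on a single line of artemis.log and that the node id follows `nodeID=` as in the message text above. The function names and the `threshold` value are hypothetical choices for illustration, not part of Artemis.

```python
import re
from collections import Counter

# Matches the warning code and captures the token after "nodeID=".
# Assumption: the whole warning is written on one log line.
DUPLICATE_NODE_ID = re.compile(r"AMQ212034.*?nodeID=(\S+)")


def count_duplicate_node_warnings(lines):
    """Return a Counter mapping node id -> number of AMQ212034 warnings seen."""
    counts = Counter()
    for line in lines:
        match = DUPLICATE_NODE_ID.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def suspicious_node_ids(counts, threshold=3):
    """Node ids warned about more than `threshold` times.

    The threshold is an arbitrary assumption: one warning per node after a
    restart is expected, repeated warnings are what deserve a closer look.
    """
    return sorted(node for node, n in counts.items() if n > threshold)


def scan_log(path, threshold=3):
    """Read a log file and return the suspicious node ids with their counts."""
    with open(path, encoding="utf-8", errors="replace") as log:
        counts = count_duplicate_node_warnings(log)
    return {node: counts[node] for node in suspicious_node_ids(counts, threshold)}
```

Calling `scan_log("artemis.log")` on each broker returns a dictionary of node ids that were reported repeatedly. An empty result does not prove the absence of split-brain, since per the answer this warning is tied to using a discovery-group, while the configuration in the question uses static connectors.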

That said, I'm not certain about the locking semantics of AWS EFS. ActiveMQ Artemis shared storage was designed to run on a SAN or NAS filesystem that supports exclusive file locks (e.g. NFSv4). If AWS EFS supports that, it should be fine. Otherwise, it won't work properly and both brokers are likely to be active simultaneously (i.e. encounter split-brain).
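
One rough way to check whether a mounted filesystem enforces exclusive locks across hosts is a small probe like the sketch below. It is not how Artemis itself takes its lock; it only uses Python's standard `fcntl.lockf` (POSIX advisory locks, Linux/Unix only) on a scratch file. The helper names are hypothetical, and the probe should point at a throwaway file on the shared mount, never at the broker's own journal or lock files.

```python
import fcntl
import os


def try_exclusive_lock(path):
    """Try to take a non-blocking exclusive POSIX lock on `path`.

    Returns the open file descriptor while the lock is held, or None if the
    lock could not be acquired (held elsewhere, or locking is unsupported).
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        # Covers both "already locked" and "filesystem has no lock support".
        os.close(fd)
        return None
    return fd


def release(fd):
    """Release the lock and close the descriptor."""
    fcntl.lockf(fd, fcntl.LOCK_UN)
    os.close(fd)
```

To use it, call `try_exclusive_lock` on the same scratch path from the first host and keep the returned descriptor open, then call it from the second host. If the second call also returns a descriptor, the mount is not enforcing exclusivity between hosts. Note that POSIX locks are per process, so two calls from the same process will both succeed and prove nothing. The probe also says nothing about what happens to a held lock when a host loses its connection to the storage, which is the scenario ARTEMIS-4143 describes.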
