Possibility of split-brain in ActiveMQ Artemis HA shared storage?

Question

How likely is split-brain in an ActiveMQ Artemis HA shared-storage deployment? ActiveMQ Artemis 2.17.0 is deployed as an HA active/passive pair with shared storage on AWS EFS. Are there any specific log statements to check for in artemis.log?

master cluster configuration

    <connectors>
       <connector name="artemis">tcp://<master_ip>:61616</connector>
       <connector name="discovery-connector">tcp://<slave_ip>:61616</connector>
    </connectors>

    <cluster-connections>
       <cluster-connection name="artemis_cluster_configuration">
          <connector-ref>artemis</connector-ref>
          <message-load-balancing>ON_DEMAND</message-load-balancing>
          <max-hops>1</max-hops>
          <static-connectors>
             <connector-ref>discovery-connector</connector-ref>
          </static-connectors>
       </cluster-connection>
    </cluster-connections>

    <ha-policy>
       <shared-store>
          <master>
             <failover-on-shutdown>true</failover-on-shutdown>
          </master>
       </shared-store>
    </ha-policy>

slave cluster configuration

    <connectors>
       <connector name="artemis">tcp://<slave_ip>:61616</connector>
       <connector name="discovery-connector">tcp://<master_ip>:61616</connector>
    </connectors>

    <cluster-connections>
       <cluster-connection name="artemis_cluster_configuration">
          <connector-ref>artemis</connector-ref>
          <message-load-balancing>ON_DEMAND</message-load-balancing>
          <max-hops>1</max-hops>
          <static-connectors>
             <connector-ref>discovery-connector</connector-ref>
          </static-connectors>
       </cluster-connection>
    </cluster-connections>

    <ha-policy>
       <shared-store>
          <slave>
             <failover-on-shutdown>true</failover-on-shutdown>
             <allow-failback>true</allow-failback>
          </slave>
       </shared-store>
    </ha-policy>

Answer 1

Score: 0

Generally speaking, shared storage is resilient against split-brain. I believe the only issue related to shared storage and split-brain that has been fixed since 2.17.0 is ARTEMIS-4143, which deals with the primary broker becoming disconnected from the shared storage and then reconnecting after the backup has already become active.

If you are using a discovery-group in your broker.xml and you encounter split-brain, you'll likely see a WARN log message with the code AMQ212034 that says:

There are more than one servers on the network broadcasting the same node id. You will see this message exactly once (per node) if a node is restarted, in which case it can be safely ignored. But if it is logged continuously it means you really do have more than one node on the same network active concurrently with the same node id. This could occur if you have a backup node active at the same time as its live node. nodeID={}
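
To tell a one-off warning after a restart from a continuous stream, it can help to count how often AMQ212034 shows up per node id in artemis.log. The following is a minimal Python sketch, assuming each warning is written as a single log line that contains the code and ends with `nodeID=<id>`, as in the message quoted above. The function names and the threshold of 2 are illustrative choices, not part of Artemis.

```python
import re
from collections import Counter

# Assumption: each warning is one log line that contains the code AMQ212034
# and ends with "nodeID=<id>", as in the message text quoted above.
WARNING_PATTERN = re.compile(r"AMQ212034.*?nodeID=(\S+)")


def count_duplicate_node_warnings(lines):
    """Count AMQ212034 warnings per node id in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = WARNING_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def repeated_node_ids(lines, threshold=2):
    """Return node ids warned about at least `threshold` times.

    The default threshold is an illustrative cutoff, not an Artemis setting.
    """
    counts = count_duplicate_node_warnings(lines)
    return sorted(node for node, seen in counts.items() if seen >= threshold)


def scan_log(path):
    """Scan a log file (for example artemis.log) and return repeated node ids."""
    with open(path, encoding="utf-8", errors="replace") as log:
        return repeated_node_ids(log)
```

Calling `scan_log("artemis.log")` returns the node ids that triggered the warning more than once. A single occurrence after a restart is expected according to the message itself, so only repeated occurrences for the same node id deserve a closer look.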

That said, I'm not certain about the locking semantics of AWS EFS. ActiveMQ Artemis shared storage was designed to run on a SAN or NAS filesystem that supports exclusive file locks (e.g. NFSv4). If AWS EFS supports that then it should be fine. Otherwise it won't work properly and both brokers are likely to be active simultaneously (i.e. encounter split-brain).
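
If you want to check the locking behaviour of the mount yourself, a small probe run from both broker hosts against the same file can show whether an exclusive lock taken on one host is honoured on the other. The sketch below is a rough approximation in Python using POSIX advisory locks via `fcntl.lockf`. It is not the code Artemis uses, and I'm assuming here that an OS-level advisory lock is a reasonable stand-in for what the broker relies on. The function name and any path you pass in are hypothetical.

```python
import errno
import fcntl
import os


def try_exclusive_lock(path):
    """Try to take a non-blocking exclusive advisory lock on `path`.

    Returns the open file descriptor if the lock was acquired (keep it open
    for as long as the lock should be held), or None if another process
    already holds a conflicting lock.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # LOCK_NB makes the call fail immediately instead of waiting.
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError as exc:
        os.close(fd)
        if exc.errno in (errno.EACCES, errno.EAGAIN):
            return None  # someone else holds the lock
        raise  # any other error (permissions, I/O) is a real failure
    return fd
```

To use it, call `try_exclusive_lock` on a scratch file on the shared mount from the first host and keep that process running, then call it on the same file from the second host. If the second call returns `None`, the lock is being enforced across hosts. If both calls return a descriptor, the filesystem is not providing the exclusive locking that shared-store HA depends on. Use a scratch file for this, not anything inside the broker's data directory.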

Posted on August 10, 2023 at 22:11:20
Source: https://go.coder-hub.com/76876542.html