ActiveMQ Artemis 2.27.1 $.artemis.internal.sf large messages are building up

Question

Running 3 nodes of ActiveMQ Artemis 2.27.1:

[Screenshot]

From the diagram we can see the cluster is formed:

[Screenshot: cluster diagram]

This is the view from node 2. I'm not sure, but the messages in the delivering state seem to be blocked.
Node 3 started paging and kept paging until the disk was full:

[Screenshot]

So now the address is blocked:

2023-05-11 09:39:54,743 INFO  [org.apache.activemq.artemis.core.server] AMQ224108: Stopped paging on address 'domain-events...'; size=92233 bytes (5 messages); maxSize=-1 bytes (-1 messages); globalSize=434274938 bytes (44134 messages); globalMaxSize=1610612736 bytes (-1 messages);
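
For orientation, the values in that log line map onto a handful of broker.xml settings. The fragment below is only a sketch of where they live, not the poster's actual configuration: the match="#" wildcard and the PAGE policy are assumptions, and the max-disk-usage value of 90 is what I understand to be the default for these versions (worth checking against the documentation for the release in use).

```xml
<core xmlns="urn:activemq:core">
  <!-- Corresponds to globalMaxSize=1610612736 bytes in the AMQ224108 line. -->
  <global-max-size>1610612736</global-max-size>

  <!-- Disk usage percentage above which the broker blocks producers.
       Assumed default; the original post does not show this setting. -->
  <max-disk-usage>90</max-disk-usage>

  <address-settings>
    <!-- Hypothetical catch-all match; the real address settings are not shown. -->
    <address-setting match="#">
      <!-- Corresponds to maxSize=-1: no per-address limit, so paging is
           driven by global-max-size alone. -->
      <max-size-bytes>-1</max-size-bytes>
      <!-- Assumed policy, consistent with the paging seen in the logs. -->
      <address-full-policy>PAGE</address-full-policy>
    </address-setting>
  </address-settings>
</core>
```

With no per-address limit, a backlog on a single internal store-and-forward queue can only be absorbed by paging to disk, which would be consistent with the disk filling up on node 3.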

Here is the configuration of the cluster connection:

      <cluster-connections>
        <cluster-connection name="artemis-cluster">
          <address></address>
          <connector-ref>artemis</connector-ref>
          <check-period>1000</check-period>
          <connection-ttl>5000</connection-ttl>
          <min-large-message-size>50000</min-large-message-size>
          <call-timeout>5000</call-timeout>
          <retry-interval>500</retry-interval>
          <retry-interval-multiplier>2.0</retry-interval-multiplier>
          <max-retry-interval>5000</max-retry-interval>
          <initial-connect-attempts>-1</initial-connect-attempts>
          <reconnect-attempts>-1</reconnect-attempts>
          <use-duplicate-detection>true</use-duplicate-detection>
          <forward-when-no-consumers>false</forward-when-no-consumers>
          <message-load-balancing>ON_DEMAND</message-load-balancing>
          <max-hops>1</max-hops>
          <confirmation-window-size>32000</confirmation-window-size>
          <producer-window-size>-1</producer-window-size>
          <call-failover-timeout>30000</call-failover-timeout>
          <notification-interval>1000</notification-interval>
          <notification-attempts>2</notification-attempts>
          <discovery-group-ref discovery-group-name="artemis-discovery-group"/>
        </cluster-connection>
      </cluster-connections>
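
One thing that stands out when reading this configuration next to the logs further down: the 5000 ms connection-ttl matches the "within the 5000ms connection TTL" failures, and the bridge stack trace shows a 101684-byte message going through the large-message send path, which is what a 50000-byte min-large-message-size on the cluster connection would cause. The sketch below relaxes those values to what I believe are the documented defaults. It is an experiment to try under the assumption that the timeouts are too aggressive for a loaded broker, not a confirmed fix.

```xml
<cluster-connections>
  <cluster-connection name="artemis-cluster">
    <address></address>
    <connector-ref>artemis</connector-ref>
    <!-- Assumed default: ping every 30 s instead of every 1 s. -->
    <check-period>30000</check-period>
    <!-- Assumed default: 60 s TTL, so a long GC pause or a slow large-message
         flush is less likely to drop the bridge connection. -->
    <connection-ttl>60000</connection-ttl>
    <!-- Assumed default (100 KiB): messages of about 100 KB would then be
         forwarded as regular messages rather than streamed in chunks. -->
    <min-large-message-size>102400</min-large-message-size>
    <!-- Assumed default: 30 s blocking call timeout. -->
    <call-timeout>30000</call-timeout>
    <!-- Remaining elements unchanged from the original configuration. -->
    <discovery-group-ref discovery-group-name="artemis-discovery-group"/>
  </cluster-connection>
</cluster-connections>
```

If the build-up disappears with these values, tightening them one at a time would show which setting is responsible.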

I think the problem only happens with large messages and under load, but not totally sure. Even after load stops it does not recover. Not sure if it's a misconfiguration or some bug.

This might be related to this question.

UPDATE:

With 2.28.0 the messages still build up:

[Screenshot]

Logs:

.4.61.133:41742 within the 5000ms connection TTL. The connection will now be closed. [code=CONNECTION_TIMEDOUT]
2023-05-13 22:05:35,120 WARN  [org.apache.activemq.artemis.core.server] AMQ222061: Client connection failed, clearing up resources for session 85fbf78c-f17a-11ed-b23e-005056a19fba
2023-05-13 22:05:35,616 WARN  [org.apache.activemq.artemis.core.server] AMQ222107: Cleared up resources for session 85fbf78c-f17a-11ed-b23e-005056a19fba
2023-05-13 22:05:35,616 WARN  [org.apache.activemq.artemis.core.server] AMQ222061: Client connection failed, clearing up resources for session 85fe8f9d-f17a-11ed-b23e-005056a19fba
2023-05-13 22:05:41,175 WARN  [org.apache.activemq.artemis.core.server] AMQ222107: Cleared up resources for session 85fe8f9d-f17a-11ed-b23e-005056a19fba
2023-05-13 22:05:40,716 ERROR [org.apache.activemq.artemis.core.server] AMQ224088: Timeout (10 seconds) on acceptor "artemis" during protocol handshake with /10.4.61.132:46738 has occurred.
2023-05-13 22:05:51,426 ERROR [org.apache.activemq.artemis.core.server] AMQ224088: Timeout (10 seconds) on acceptor "artemis" during protocol handshake with /10.4.61.132:46740 has occurred.
2023-05-13 22:05:44,460 WARN  [org.apache.activemq.artemis.core.client] AMQ212057: Large Message Streaming is taking too long to flush on back pressure.
2023-05-13 22:05:43,052 ERROR [org.apache.activemq.artemis.core.server] AMQ224088: Timeout (10 seconds) on acceptor "artemis" during protocol handshake with /10.4.61.133:46682 has occurred.
2023-05-13 22:05:43,052 WARN  [org.apache.activemq.artemis.core.server] AMQ224089: Failed to calculate persistent size
java.lang.OutOfMemoryError: Java heap space
2023-05-13 22:05:59,050 WARN  [org.apache.activemq.artemis.core.client] AMQ212037: Connection failure to /10.4.61.133:41740 has been detected: AMQ229014: Did not receive data from /10.4.61.133:41740 within the 5000ms connection TTL. The connection will now be closed. [code=CONNECTION_TIMEDOUT]
2023-05-13 22:06:03,245 ERROR [org.apache.activemq.artemis.core.server] AMQ224088: Timeout (10 seconds) on acceptor "artemis" during protocol handshake with /10.4.61.132:46746 has occurred.
2023-05-13 22:06:03,245 WARN  [org.apache.activemq.artemis.core.server] AMQ222225: Sending unexpected exception to the client
java.lang.OutOfMemoryError: Java heap space
2023-05-13 22:06:06,844 WARN  [org.apache.activemq.artemis.core.client] AMQ212037: Connection failure to /10.4.61.132:41504 has been detected: AMQ229014: Did not receive data from /10.4.61.132:41504 within the 5000ms connection TTL. The connection will now be closed. [code=CONNECTION_TIMEDOUT]
2023-05-13 22:06:06,844 WARN  [org.apache.activemq.artemis.core.server] AMQ222061: Client connection failed, clearing up resources for session b48ce589-f17a-11ed-8430-005056a12ace
2023-05-13 22:06:06,844 WARN  [org.apache.activemq.artemis.core.server] AMQ222107: Cleared up resources for session b48ce589-f17a-11ed-8430-005056a12ace
2023-05-13 22:06:06,844 WARN  [org.apache.activemq.artemis.core.server] AMQ222061: Client connection failed, clearing up resources for session b492d8fa-f17a-11ed-8430-005056a12ace
2023-05-13 22:06:07,320 WARN  [org.apache.activemq.artemis.core.server] AMQ222095: Connection failed with failedOver=false
2023-05-13 22:06:07,777 WARN  [org.apache.activemq.artemis.core.server] AMQ222107: Cleared up resources for session b492d8fa-f17a-11ed-8430-005056a12ace
2023-05-13 22:06:09,089 WARN  [org.eclipse.jetty.server.HttpChannel] /metrics/
java.lang.OutOfMemoryError: Java heap space
2023-05-13 22:06:07,777 WARN  [org.apache.activemq.artemis.core.client] AMQ212037: Connection failure to 01.qa-messaging.example.org/10.4.61.131:61616 has been detected: AMQ219011: Did not receive data from server for org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnection@365ee10b[ID=97f67787, local= /10.4.61.131:52730, remote=01.qa-messaging.example.org/10.4.61.131:61616] [code=CONNECTION_TIMEDOUT]
2023-05-13 22:05:58,601 WARN  [org.apache.activemq.artemis.core.server] AMQ222094: Bridge unable to send message PagedReferenceImpl [message=PagedMessageImpl [queueIDs=[13], transactionID=-1, page=2020, message=CoreMessage[messageID=85540622192,durable=true,userID=25a127af-f141-11ed-9cc5-2a73978c3d8c,priority=4, timestamp=Sat May 13 05:49:22 CEST 2023,expiration=0, durable=true, address=domain-event.main.artemis-sqs.example.v1.event,size=101684,properties=TypedProperties[BusinessResourceType=OBJECT,BusinessEventVersion=v1,Address=domain-event.main.artemis-sqs.example.v1.event,Authorization={"principal":"anonymous","authorities":[{"authority":"ROLE_ANONYMOUS"}]},_AMQ_ROUTING_TYPE=0,Sqs_Msa_ApproximateReceiveCount=1,AuthorizationType=anonymous-v1.0,Sqs_Msa_SenderId=AIDAIP3MER2HFHNCCMVD4,_AMQ_ROUTE_TO$.artemis.internal.sf.artemis-cluster.c0c60e22-6d48-11eb-afe9-005056a19fba=[0000 0004 5526 D61B),bytesAsLongs(18608477723],__AMQ_CID=1f103ef5-f141-11ed-9cc5-2a73978c3d8c,SendOffsetDateTime=2023-05-13T03:49:22.454731605Z,Sqs_Msa_SentTimestamp=1683721338132,timestamp=1683949762454]]@985312265], deliveryTime=0, persistedCount=0, deliveryCount=0, subscription=PageSubscriptionImpl [cursorId=13, queue=QueueImpl[name=$.artemis.internal.sf.artemis-cluster.c0c60e22-6d48-11eb-afe9-005056a19fba, postOffice=PostOfficeImpl [server=ActiveMQServerImpl::name=01.qa-messaging.example.org], temp=false]@13317738, filter = null]], will try again once bridge reconnects
org.apache.activemq.artemis.api.core.ActiveMQException: Connection f6a6ea38 closed or disconnected
at org.apache.activemq.artemis.core.protocol.core.impl.ActiveMQSessionContext.sendSessionSendContinuationMessage(ActiveMQSessionContext.java:1106) ~[artemis-core-client-2.28.0.jar:2.28.0]
at org.apache.activemq.artemis.core.protocol.core.impl.ActiveMQSessionContext.sendLargeMessageChunk(ActiveMQSessionContext.java:607) ~[artemis-core-client-2.28.0.jar:2.28.0]
at org.apache.activemq.artemis.core.client.impl.ClientProducerImpl.largeMessageSendStreamed(ClientProducerImpl.java:507) ~[artemis-core-client-2.28.0.jar:2.28.0]
at org.apache.activemq.artemis.core.client.impl.ClientProducerImpl.largeMessageSendBuffered(ClientProducerImpl.java:416) ~[artemis-core-client-2.28.0.jar:2.28.0]
at org.apache.activemq.artemis.core.client.impl.ClientProducerImpl.largeMessageSend(ClientProducerImpl.java:345) ~[artemis-core-client-2.28.0.jar:2.28.0]
at org.apache.activemq.artemis.core.client.impl.ClientProducerImpl.doSend(ClientProducerImpl.java:275) ~[artemis-core-client-2.28.0.jar:2.28.0]
at org.apache.activemq.artemis.core.client.impl.ClientProducerImpl.send(ClientProducerImpl.java:147) ~[artemis-core-client-2.28.0.jar:2.28.0]
at org.apache.activemq.artemis.core.client.impl.ClientProducerImpl.send(ClientProducerImpl.java:129) ~[artemis-core-client-2.28.0.jar:2.28.0]
at org.apache.activemq.artemis.core.server.cluster.impl.BridgeImpl.deliverStandardMessage(BridgeImpl.java:748) ~[artemis-server-2.28.0.jar:2.28.0]
at org.apache.activemq.artemis.core.server.cluster.impl.BridgeImpl.handle(BridgeImpl.java:607) ~[artemis-server-2.28.0.jar:2.28.0]
at org.apache.activemq.artemis.core.server.impl.QueueImpl.handle(QueueImpl.java:3980) ~[artemis-server-2.28.0.jar:2.28.0]
at org.apache.activemq.artemis.core.server.impl.QueueImpl.deliver(QueueImpl.java:3127) ~[artemis-server-2.28.0.jar:2.28.0]
at org.apache.activemq.artemis.core.server.impl.QueueImpl$DeliverRunner.run(QueueImpl.java:4298) ~[artemis-server-2.28.0.jar:2.28.0]
at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:57) ~[artemis-commons-2.28.0.jar:?]
at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:32) ~[artemis-commons-2.28.0.jar:?]
at org.apache.activemq.artemis.utils.actors.ProcessorBase.executePendingTasks(ProcessorBase.java:68) ~[artemis-commons-2.28.0.jar:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
at org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1.run(ActiveMQThreadFactory.java:118) ~[artemis-commons-2.28.0.jar:?]
2023-05-13 22:06:11,850 WARN  [org.apache.activemq.artemis.core.server] AMQ222095: Connection failed with failedOver=false
2023-05-13 22:06:14,579 WARN  [org.apache.activemq.artemis.core.client] AMQ212037: Connection failure to /10.4.61.132:41506 has been detected: AMQ229014: Did not receive data from /10.4.61.132:41506 within the 5000ms connection TTL. The connection will now be closed. [code=CONNECTION_TIMEDOUT]

and:

2023-05-13 22:17:38,099 WARN  [org.apache.activemq.artemis.utils.actors.OrderedExecutor] Java heap space
java.lang.OutOfMemoryError: Java heap space
at io.netty.util.internal.PlatformDependent.allocateUninitializedArray(PlatformDependent.java:325) ~[netty-common-4.1.86.Final.jar:4.1.86.Final]
at io.netty.buffer.UnpooledUnsafeHeapByteBuf.allocateArray(UnpooledUnsafeHeapByteBuf.java:39) ~[netty-buffer-4.1.86.Final.jar:4.1.86.Final]
at io.netty.buffer.UnpooledByteBufAllocator$InstrumentedUnpooledUnsafeHeapByteBuf.allocateArray(UnpooledByteBufAllocator.java:144) ~[netty-buffer-4.1.86.Final.jar:4.1.86.Final]

and:

2023-05-13 22:18:21,543 ERROR [org.jgroups.protocols.UDP] JGRP000027: failed passing message up
java.lang.OutOfMemoryError: Java heap space
2023-05-13 22:18:24,295 WARN  [org.apache.activemq.artemis.core.paging.cursor.impl.PageSubscriptionImpl] Java heap space
java.lang.OutOfMemoryError: Java heap space
2023-05-13 22:18:24,868 ERROR [org.apache.activemq.artemis.core.server] AMQ222010: Critical IO Error, shutting down the server. file=Java heap space, message=NULL
java.lang.OutOfMemoryError: Java heap space
2023-05-13 22:18:24,877 WARN  [org.apache.activemq.artemis.core.client] AMQ212057: Large Message Streaming is taking too long to flush on back pressure.

The 'Java heap space' errors make me suspect a lack of resources, but even without load it doesn't recover.
I increased the maximum heap to 6 GB (-Xmx6G).

When calling retryMessages():

...
2023-05-15 10:58:45,508 WARN  [org.apache.activemq.artemis.core.server] AMQ222188: Unable to find target queue for node null
2023-05-15 10:58:45,509 WARN  [org.apache.activemq.artemis.core.server] AMQ222188: Unable to find target queue for node null
2023-05-15 10:58:45,509 WARN  [org.apache.activemq.artemis.core.server] AMQ222188: Unable to find target queue for node null
2023-05-15 10:58:46,467 WARN  [org.apache.activemq.artemis.core.client] AMQ212037: Connection failure to /10.4.61.133:36734 has been detected: AMQ229014: Did not receive data from /10.4.61.133:36734 within the 5000ms connection TTL. The connection will now be closed. [code=CONNECTION_TIMEDOUT]
2023-05-15 10:58:49,296 WARN  [org.apache.activemq.artemis.core.server] AMQ222061: Client connection failed, clearing up resources for session 6da3c37a-f2fd-11ed-9dca-2e3f529a8ed8
2023-05-15 10:58:49,296 WARN  [org.apache.activemq.artemis.core.server] AMQ222107: Cleared up resources for session 6da3c37a-f2fd-11ed-9dca-2e3f529a8ed8
2023-05-15 10:58:49,296 WARN  [org.apache.activemq.artemis.core.server] AMQ222061: Client connection failed, clearing up resources for session 6da87e6b-f2fd-11ed-9dca-2e3f529a8ed8
2023-05-15 10:58:49,297 WARN  [org.apache.activemq.artemis.core.server] AMQ222107: Cleared up resources for session 6da87e6b-f2fd-11ed-9dca-2e3f529a8ed8
2023-05-15 10:58:49,297 WARN  [org.apache.activemq.artemis.core.server] AMQ222061: Client connection failed, clearing up resources for session 965c4dec-f2fe-11ed-9dca-2e3f529a8ed8
2023-05-15 10:58:49,297 WARN  [org.apache.activemq.artemis.core.server] AMQ222107: Cleared up resources for session 965c4dec-f2fe-11ed-9dca-2e3f529a8ed8
2023-05-15 10:58:53,924 WARN  [org.apache.activemq.artemis.core.server] AMQ222095: Connection failed with failedOver=false
2023-05-15 10:58:54,842 ERROR [org.apache.activemq.artemis.core.server] AMQ224088: Timeout (10 seconds) on acceptor "artemis" during protocol handshake with /10.4.61.132:38842 has occurred.
2023-05-15 10:58:54,842 ERROR [org.apache.activemq.artemis.core.server] AMQ224088: Timeout (10 seconds) on acceptor "artemis" during protocol handshake with /10.4.61.132:38844 has occurred.
2023-05-15 10:58:55,781 ERROR [org.apache.activemq.artemis.core.server] AMQ224088: Timeout (10 seconds) on acceptor "artemis" during protocol handshake with /10.4.61.132:38838 has occurred.
2023-05-15 10:58:55,781 ERROR [org.apache.activemq.artemis.core.server] AMQ224088: Timeout (10 seconds) on acceptor "artemis" during protocol handshake with /10.4.61.133:36816 has occurred.
2023-05-15 10:58:55,782 WARN  [org.apache.activemq.artemis.core.client] AMQ212037: Connection failure to /10.4.61.132:38736 has been detected: AMQ229014: Did not receive data from /10.4.61.132:38736 within the 5000ms connection TTL. The connection will now be closed. [code=CONNECTION_TIMEDOUT]
2023-05-15 10:58:57,661 ERROR [org.apache.activemq.artemis.core.server] AMQ224088: Timeout (10 seconds) on acceptor "artemis" during protocol handshake with /10.4.61.132:38850 has occurred.
2023-05-15 10:58:59,537 ERROR [org.apache.activemq.artemis.core.server] AMQ224088: Timeout (10 seconds) on acceptor "artemis" during protocol handshake with /10.4.61.133:36836 has occurred.
2023-05-15 10:59:00,448 WARN  [org.apache.activemq.artemis.core.client] AMQ212041: Timed out waiting for netty channel to close
2023-05-15 10:59:02,269 WARN  [org.apache.activemq.artemis.core.client] AMQ212037: Connection failure to /10.4.61.133:36736 has been detected: AMQ229014: Did not receive data from /10.4.61.133:36736 within the 5000ms connection TTL. The connection will now be closed. [code=CONNECTION_TIMEDOUT]
2023-05-15 10:59:02,269 WARN  [org.apache.activemq.artemis.core.server] AMQ222061: Client connection failed, clearing up resources for session 776504e0-f2fd-11ed-8573-005056a19fba
2023-05-15 10:59:02,269 WARN  [org.apache.activemq.artemis.core.server] AMQ222107: Cleared up resources for session 776504e0-f2fd-11ed-8573-005056a19fba
2023-05-15 10:59:02,269 WARN  [org.apache.activemq.artemis.core.server] AMQ222061: Client connection failed, clearing up resources for session 7768fc81-f2fd-11ed-8573-005056a19fba
2023-05-15 10:59:02,269 WARN  [org.apache.activemq.artemis.core.server] AMQ222107: Cleared up resources for session 7768fc81-f2fd-11ed-8573-005056a19fba
2023-05-15 10:59:14,400 ERROR [org.apache.activemq.artemis.core.server] AMQ224088: Timeout (10 seconds) on acceptor "artemis" during protocol handshake with /10.4.61.133:36840 has occurred.
2023-05-15 10:59:14,401 WARN  [org.apache.activemq.artemis.core.client] AMQ212041: Timed out waiting for netty channel to close
2023-05-15 10:59:17,145 WARN  [org.apache.activemq.artemis.core.server] AMQ222061: Client connection failed, clearing up resources for session 6da01a8d-f2fd-11ed-bdeb-3ef0a8883d0e
2023-05-15 10:59:19,838 WARN  [org.apache.activemq.artemis.core.server] AMQ222107: Cleared up resources for session 6da01a8d-f2fd-11ed-bdeb-3ef0a8883d0e
2023-05-15 10:59:19,838 WARN  [org.apache.activemq.artemis.core.server] AMQ222061: Client connection failed, clearing up resources for session 6da4fc8e-f2fd-11ed-bdeb-3ef0a8883d0e

It seems it cannot find the "old" node (does the node UUID get regenerated?), even though the cluster still forms.

Answer 1

Score: 1

Updating to 2.28.0 could fix this issue because there are a lot of issues related to large messages that are fixed in 2.28.0.

Posted on 2023-05-11 16:28:10
Link: https://go.coder-hub.com/76225590.html
