SOLR Max requests queued per destination 3000 exceeded for HttpDestination + TIMED_WAITING
Question
We are running SolrCloud 8.3.1 (NRT) with a ZooKeeper ensemble, 3 nodes each, on CentOS VMs.
Each Solr node has 66 GB RAM, a 15 GB heap, and 4 CPUs.
Record count: 3.3 million. Average document size: 350 KB.
Everything works fine until something disturbs the cluster, such as a load spike or network latency. The number of threads in TIMED_WAITING then climbs to 7000+ and stays there until Solr is restarted.
Server 1:
7722 threads are in TIMED_WAITING
("lock":"java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@151d5f2f")
Server 2:
4046 threads are in TIMED_WAITING
("lock":"java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@1e0205c3")
Server 3:
4210 threads are in TIMED_WAITING
("lock":"java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@5ee792c0")
How can we increase the limit of 3000 to something bigger? Would net.ipv4.tcp_tw_reuse=1 help, and what are its drawbacks? Please help.
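For anyone who wants to reproduce per-state counts like the ones above, here is a minimal sketch. It assumes a plain-text, jstack-style thread dump (lines of the form `java.lang.Thread.State: TIMED_WAITING (parking)`), not the JSON dump the figures above appear to come from. The function name `count_thread_states` and the `SOLR_PID` variable are hypothetical.

```shell
# Count threads per state from a jstack-style thread dump read on stdin.
# Prints one "<count> <STATE>" line per state (order is not guaranteed).
count_thread_states() {
  awk '/java.lang.Thread.State:/ { count[$2]++ }
       END { for (s in count) print count[s], s }'
}

# Example usage (SOLR_PID is a placeholder for the Solr JVM's process id):
#   jstack "$SOLR_PID" | count_thread_states
```

Running it a few times during and after an incident should show whether the TIMED_WAITING count keeps growing or has plateaued.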
Answer 1
Score: 2
One possible workaround is to switch to HTTP/1 (Solr option -Dsolr.http1).
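If Solr is started through bin/solr, one place to pass this flag is solr.in.sh. The line below is a minimal sketch, assuming the stock `SOLR_OPTS` variable from that file. The explicit `=true` is an assumption on my part (the flag above is given bare), on the expectation that the property is read as a boolean, so check it against your Solr version.

```shell
# solr.in.sh (sketch): append the HTTP/1 switch to the JVM options Solr starts with.
# The "=true" value is an assumption; the answer above gives only -Dsolr.http1.
SOLR_OPTS="${SOLR_OPTS:-} -Dsolr.http1=true"
```

The setting would need to be applied on every node, followed by a restart of each node.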
Answer 2
Score: 0
Check system time/NTP synchronization during the error window. A time sync problem might be one of the root causes. Also, watch for explicit commits from clients.