May 30, 2020 14:53:13


Elasticsearch: "failed to get node info for {IP}" and "NoNodeAvailableException" in service log

Question


I am facing an issue that I wasn't seeing earlier.

I am attaching logs from my service and from Elasticsearch (2.4.4):

2020-05-30 06:29:44.576  INFO 24787 --- [generic][T#287]] org.elasticsearch.client.transport       : [Shatter] failed to get node info for {#transport#-1}{172.17.0.1}{172.17.0.1:9300}, disconnecting...

org.elasticsearch.transport.ReceiveTimeoutTransportException: [][172.17.0.1:9300][cluster:monitor/nodes/liveness] request_id [10242] timed out after [5000ms]
        at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:698) ~[elasticsearch-2.4.4.jar!/:2.4.4]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_242]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_242]
        at java.lang.Thread.run(Thread.java:748) [na:1.8.0_242]

Elasticsearch logs:

[2020-05-30 06:29:46,784][INFO ][monitor.jvm              ] [Tempo] [gc][old][230125][41498] duration [8.2s], collections [1]/[9s], total [8.2s]/[10.7h], memory [473.2mb]->[426.1mb]/[494.9mb], all_pools {[young] [131.8mb]->[84.7mb]/[136.5mb]}{[survivor] [0b]->[0b]/[17mb]}{[old] [341.3mb]->[341.3mb]/[341.3mb]}
[2020-05-30 06:33:47,782][INFO ][monitor.jvm              ] [Tempo] [gc][old][230340][41540] duration [7s], collections [1]/[7.8s], total [7s]/[10.7h], memory [493.3mb]->[425mb]/[494.9mb], all_pools {[young] [136.5mb]->[83.6mb]/[136.5mb]}{[survivor] [15.4mb]->[0b]/[17mb]}{[old] [341.3mb]->[341.3mb]/[341.3mb]}
[2020-05-30 06:37:59,384][INFO ][monitor.jvm              ] [Tempo] [gc][old][230569][41582] duration [6.9s], collections [1]/[7.2s], total [6.9s]/[10.7h], memory [494.8mb]->[424.7mb]/[494.9mb], all_pools {[young] [136.5mb]->[83.4mb]/[136.5mb]}{[survivor] [16.9mb]->[0b]/[17mb]}{[old] [341.3mb]->[341.3mb]/[341.3mb]}

I am not seeing the issue in my development environment, but I get it when I deploy on EC2. Also, when I restart Elasticsearch it works absolutely fine, but after 10-15 minutes or less, depending on the volume of search or insertion queries, the error message appears.

Also, the storage space on the instance is 74% consumed (94G out of 120G).
Can it be because of memory?
I am pretty sure my res-client code is fine, as it has been working in production for a long time.
Can it be a port issue? I am running Elasticsearch in a Docker container.

Any help will be appreciated.

_cat/fielddata?v

_cat/nodes?v

Answer 1

Score: 1


I think your heap size for Elasticsearch is very low. My best guess is that increasing the heap size will solve the problem.
As for why this is happening now, I think it is because the volume of data has grown over time.
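One way to check the heap guess against the logs is to read the per-pool numbers in the monitor.jvm lines. The following is a minimal Python sketch, not part of the original answer: the function name pool_usage_after_gc and the regular expression are illustrative assumptions about the 2.x log layout shown in the question. It reports how full each memory pool still is after a collection.

```python
import re

# Matches one pool entry such as "{[old] [341.3mb]->[341.3mb]/[341.3mb]}".
# The pattern is an assumption based on the log lines quoted in the question.
POOL_RE = re.compile(
    r"\{\[(\w+)\] \[([\d.]+)(b|kb|mb|gb)\]->"
    r"\[([\d.]+)(b|kb|mb|gb)\]/\[([\d.]+)(b|kb|mb|gb)\]\}"
)

UNIT = {"b": 1, "kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3}


def pool_usage_after_gc(line):
    """Return {pool name: fraction of the pool still in use after the GC}."""
    usage = {}
    for name, _before, _before_unit, after, after_unit, cap, cap_unit in POOL_RE.findall(line):
        after_bytes = float(after) * UNIT[after_unit]
        cap_bytes = float(cap) * UNIT[cap_unit]
        usage[name] = after_bytes / cap_bytes if cap_bytes else 0.0
    return usage


# First Elasticsearch log line from the question.
line = (
    "[2020-05-30 06:29:46,784][INFO ][monitor.jvm              ] [Tempo] "
    "[gc][old][230125][41498] duration [8.2s], collections [1]/[9s], "
    "total [8.2s]/[10.7h], memory [473.2mb]->[426.1mb]/[494.9mb], "
    "all_pools {[young] [131.8mb]->[84.7mb]/[136.5mb]}"
    "{[survivor] [0b]->[0b]/[17mb]}{[old] [341.3mb]->[341.3mb]/[341.3mb]}"
)

usage = pool_usage_after_gc(line)
for pool, fraction in usage.items():
    print(f"{pool}: {fraction:.0%} used after GC")
```

In the quoted lines the old pool reads [341.3mb]->[341.3mb]/[341.3mb], so an old-generation collection lasting several seconds freed nothing from that pool. That is consistent with a heap that is too small for the data, and pauses of that length exceed the 5000ms liveness timeout seen in the client log.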

My second guess is high load. It seems that Elasticsearch has been receiving too many requests recently. You can check the request queue sizes via /_cat/thread_pool?v.
There are two solutions for this situation: first, reduce the request rate; second, add a node and add replicas.
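To make the queue check concrete, here is a small Python sketch that scans the text returned by /_cat/thread_pool?v and lists every non-zero queue or rejected counter. It assumes the column naming of the form <pool>.queue and <pool>.rejected; the helper name busy_pools and the sample text are made up for illustration and are not output from the asker's cluster.

```python
def busy_pools(cat_output):
    """Return [(host, column, value)] for every non-zero queue/rejected column."""
    lines = [ln for ln in cat_output.strip().splitlines() if ln.strip()]
    header = lines[0].split()
    findings = []
    for row in lines[1:]:
        record = dict(zip(header, row.split()))
        for column, value in record.items():
            is_counter = column.endswith((".queue", ".rejected"))
            if is_counter and value.isdigit() and int(value) > 0:
                findings.append((record.get("host", "?"), column, int(value)))
    return findings


# Hypothetical sample text: host, IP, and all numbers are invented.
sample = """
host       ip         bulk.active bulk.queue bulk.rejected index.active index.queue index.rejected search.active search.queue search.rejected
172.17.0.2 172.17.0.2           0          0             0            0           0              0             4          87              12
"""

findings = busy_pools(sample)
for host, column, value in findings:
    print(f"{host}: {column} = {value}")
```

A queue that keeps growing, or a rejected counter that keeps rising between calls, would support the high-load explanation; counters that stay at zero would point back toward the heap.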
