Redis instance on Docker randomly disconnects from ioredis once every few months, how to debug?
Question
I am using the ioredis node library with Redis configured in a docker-compose file. The Redis server starts successfully, but after a few days I encounter an "ECONNREFUSED" error when attempting to connect. Restarting the Docker containers resolves the issue temporarily, but I'm looking for a more elegant solution and an understanding of the root cause of the problem.
I am using the node library ioredis, and Redis is configured in docker-compose like this:
redis:
  image: "redis:latest"
  volumes:
    - redis_data:/data
It is the simplest possible config, so I hope nothing is broken here.
My connection code is also the simplest possible:
import Redis from "ioredis";
export const redis = new Redis(process.env.REDIS_URL ?? '');
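I do not register any error listener or custom retry logic. For reference, a version with an explicit handler would look roughly like this (an illustrative sketch, not what I currently run):

import Redis from "ioredis";

export const redis = new Redis(process.env.REDIS_URL ?? '', {
  // Keep reconnecting with a capped backoff instead of giving up.
  retryStrategy: (times) => Math.min(times * 200, 5000),
  // The ioredis default is 20, which is the limit mentioned in the error further down.
  maxRetriesPerRequest: 20,
});

// Without a listener, ioredis only logs "[ioredis] Unhandled error event".
redis.on("error", (err) => {
  console.error("redis connection error", err);
});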
When I run docker-compose up, I can see these logs:
redis_1 | 1:C 09 Jan 2023 06:00:49.251 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
redis_1 | 1:C 09 Jan 2023 06:00:49.252 # Redis version=7.0.10, bits=64, commit=00000000, modified=0, pid=1, just started
redis_1 | 1:C 09 Jan 2023 06:00:49.252 # Warning: no config file specified, using the default config. In order to specify a config file use redis-server /path/to/redis.conf
redis_1 | 1:M 09 Jan 2023 06:00:49.254 * monotonic clock: POSIX clock_gettime
redis_1 | 1:M 09 Jan 2023 06:00:49.258 * Running mode=standalone, port=6379.
redis_1 | 1:M 09 Jan 2023 06:00:49.258 # Server initialized
redis_1 | 1:M 09 Jan 2023 06:00:49.259 # WARNING Memory overcommit must be enabled! Without it, a background save or replication may fail under low memory condition. Being disabled, it can can also cause failures without low memory condition, see https://github.com/jemalloc/jemalloc/issues/1328. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
redis_1 | 1:M 09 Jan 2023 06:00:49.260 * Loading RDB produced by version 7.0.10
redis_1 | 1:M 09 Jan 2023 06:00:49.261 * RDB age 120617 seconds
redis_1 | 1:M 09 Jan 2023 06:00:49.261 * RDB memory usage when created 274.70 Mb
redis_1 | 1:M 09 Jan 2023 06:00:51.257 * Done loading RDB, keys loaded: 1201, keys expired: 0.
redis_1 | 1:M 09 Jan 2023 06:00:51.258 * DB loaded from disk: 1.998 seconds
redis_1 | 1:M 09 Jan 2023 06:00:51.259 * Ready to accept connections
Then, for many days, I see these logs repeating:
redis_1 | 1:M 09 May 2023 15:49:24.506 * 1 changes in 3600 seconds. Saving...
redis_1 | 1:M 09 May 2023 15:49:24.517 * Background saving started by pid 207
redis_1 | 207:C 09 May 2023 15:49:29.023 * DB saved on disk
redis_1 | 207:C 09 May 2023 15:49:29.025 * Fork CoW for RDB: current 1 MB, peak 1 MB, average 1 MB
redis_1 | 1:M 09 May 2023 15:49:29.094 * Background saving terminated with success
redis_1 | 1:M 09 May 2023 16:49:30.043 * 1 changes in 3600 seconds. Saving...
redis_1 | 1:M 09 May 2023 16:49:30.061 * Background saving started by pid 208
redis_1 | 208:C 09 May 2023 16:49:31.606 * DB saved on disk
redis_1 | 208:C 09 May 2023 16:49:31.608 * Fork CoW for RDB: current 1 MB, peak 1 MB, average 0 MB
redis_1 | 1:M 09 May 2023 16:49:31.666 * Background saving terminated with success
The app operates normally, and then suddenly I see these logs in the app:
app_1 | [ioredis] Unhandled error event: Error: connect ECONNREFUSED 172.18.0.11:6379
app_1 | at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1494:16)
app_1 | [ioredis] Unhandled error event: Error: connect ECONNREFUSED 172.18.0.11:6379
app_1 | at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1494:16)
app_1 | [ioredis] Unhandled error event: Error: connect ECONNREFUSED 172.18.0.11:6379
app_1 | at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1494:16)
app_1 | finished in 1875996ms
app_1 | [ioredis] Unhandled error event: Error: getaddrinfo EAI_AGAIN redis
app_1 | at GetAddrInfoReqWrap.onlookup [as oncomplete] (node:dns:107:26)
app_1 | [ioredis] Unhandled error event: Error: getaddrinfo EAI_AGAIN redis
app_1 | at GetAddrInfoReqWrap.onlookup [as oncomplete] (node:dns:107:26)
app_1 | [ioredis] Unhandled error event: Error: getaddrinfo EAI_AGAIN redis
app_1 | at GetAddrInfoReqWrap.onlookup [as oncomplete] (node:dns:107:26)
app_1 | [ioredis] Unhandled error event: Error: getaddrinfo EAI_AGAIN redis
app_1 | at GetAddrInfoReqWrap.onlookup [as oncomplete] (node:dns:107:26)
app_1 | [ioredis] Unhandled error event: Error: getaddrinfo EAI_AGAIN redis
app_1 | at GetAddrInfoReqWrap.onlookup [as oncomplete] (node:dns:107:26)
app_1 | [ioredis] Unhandled error event: Error: getaddrinfo EAI_AGAIN redis
app_1 | at GetAddrInfoReqWrap.onlookup [as oncomplete] (node:dns:107:26)
app_1 | [ioredis] Unhandled error event: Error: getaddrinfo EAI_AGAIN redis
app_1 | at GetAddrInfoReqWrap.onlookup [as oncomplete] (node:dns:107:26)
app_1 | [ioredis] Unhandled error event: Error: getaddrinfo EAI_AGAIN redis
app_1 | at GetAddrInfoReqWrap.onlookup [as oncomplete] (node:dns:107:26)
app_1 | [ioredis] Unhandled error event: Error: getaddrinfo EAI_AGAIN redis
app_1 | at GetAddrInfoReqWrap.onlookup [as oncomplete] (node:dns:107:26)
app_1 | [ioredis] Unhandled error event: Error: getaddrinfo EAI_AGAIN redis
app_1 | at GetAddrInfoReqWrap.onlookup [as oncomplete] (node:dns:107:26)
app_1 | [ioredis] Unhandled error event: Error: getaddrinfo EAI_AGAIN redis
app_1 | at GetAddrInfoReqWrap.onlookup [as oncomplete] (node:dns:107:26)
app_1 | [ioredis] Unhandled error event: Error: getaddrinfo EAI_AGAIN redis
app_1 | at GetAddrInfoReqWrap.onlookup [as oncomplete] (node:dns:107:26)
app_1 | [ioredis] Unhandled error event: Error: getaddrinfo EAI_AGAIN redis
app_1 | at GetAddrInfoReqWrap.onlookup [as oncomplete] (node:dns:107:26)
app_1 | [ioredis] Unhandled error event: Error: getaddrinfo EAI_AGAIN redis
app_1 | at GetAddrInfoReqWrap.onlookup [as oncomplete] (node:dns:107:26)
app_1 | [ioredis] Unhandled error event: Error: getaddrinfo EAI_AGAIN redis
app_1 | at GetAddrInfoReqWrap.onlookup [as oncomplete] (node:dns:107:26)
app_1 | /opt/app/node_modules/ioredis/built/redis/event_handler.js:182
app_1 | self.flushQueue(new errors_1.MaxRetriesPerRequestError(maxRetriesPerRequest));
app_1 | ^
app_1 |
app_1 | MaxRetriesPerRequestError: Reached the max retries per request limit (which is 20). Refer to "maxRetriesPerRequest" option for details.
app_1 | at Socket.<anonymous> (/opt/app/node_modules/ioredis/built/redis/event_handler.js:182:37)
app_1 | at Object.onceWrapper (node:events:628:26)
app_1 | at Socket.emit (node:events:513:28)
app_1 | at TCP.<anonymous> (node:net:322:12)
But 2 hours later new logs are produced by Redis, showing that it is working again:
redis_1 | 1:M 09 May 2023 18:38:33.833 * 1 changes in 3600 seconds. Saving...
redis_1 | 1:M 09 May 2023 18:38:33.842 * Background saving started by pid 209
redis_1 | 209:C 09 May 2023 18:38:35.505 * DB saved on disk
redis_1 | 209:C 09 May 2023 18:38:35.506 * Fork CoW for RDB: current 1 MB, peak 1 MB, average 0 MB
redis_1 | 1:M 09 May 2023 18:38:35.553 * Background saving terminated with success
redis_1 | 1:M 09 May 2023 19:38:36.096 * 1 changes in 3600 seconds. Saving...
redis_1 | 1:M 09 May 2023 19:38:36.108 * Background saving started by pid 210
redis_1 | 210:C 09 May 2023 19:38:37.452 * DB saved on disk
redis_1 | 210:C 09 May 2023 19:38:37.454 * Fork CoW for RDB: current 1 MB, peak 1 MB, average 0 MB
redis_1 | 1:M 09 May 2023 19:38:37.512 * Background saving terminated with success
redis_1 | 1:M 10 May 2023 09:19:02.490 * 1 changes in 3600 seconds. Saving...
redis_1 | 1:M 10 May 2023 09:19:02.538 * Background saving started by pid 211
redis_1 | 211:C 10 May 2023 09:19:06.152 * DB saved on disk
My current strategy is:
- ping the server every few minutes to check whether I can connect to Redis; if not, log in to the server and run
docker-compose down
docker-compose up
This always works, but I would like to fix the problem in a more elegant way and understand the reason for this error.
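A minimal version of such a check could look like this (a simplified sketch, not my exact script; the alert and restart steps are omitted):

import Redis from "ioredis";

// Connect once per check and fail fast instead of retrying forever.
const redis = new Redis(process.env.REDIS_URL ?? '', {
  connectTimeout: 2000,
  maxRetriesPerRequest: 1,
  retryStrategy: () => null, // do not reconnect, just report the failure
});

redis
  .ping()
  .then(() => {
    console.log("redis is reachable");
    process.exit(0);
  })
  .catch((err) => {
    // This is where the docker-compose restart / alert would be triggered.
    console.error("redis is NOT reachable", err);
    process.exit(1);
  });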
I have been able to reproduce this behaviour on a few independent services that I maintain, but it is very hard to predict when the error will occur.
Answer 1
Score: 0
The logs show timeouts, but the real reason was limited memory on the server, which was not being monitored. When I logged in after the incident, memory usage was back at a normal level, but the lack of memory was the direct cause of these problems.
Long-term solutions:
- limit the memory available to the Redis container and select the correct eviction policy (see the sketch below)
- set up memory monitoring to prevent this situation
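As an illustration of the first point, the compose service from the question could be extended roughly like this (the 256mb limit and allkeys-lru policy are example values, not a recommendation for every workload):

redis:
  image: "redis:latest"
  # Cap Redis memory and evict least-recently-used keys instead of failing allocations.
  command: ["redis-server", "--maxmemory", "256mb", "--maxmemory-policy", "allkeys-lru"]
  volumes:
    - redis_data:/data

Whether allkeys-lru is appropriate depends on whether the data can safely be evicted; volatile-lru or noeviction may fit other workloads better.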