Why does docker compose intermittently lose communication between containers?

Question

I've been using this docker compose configuration for a couple of years in production now, and it's been fine until randomly crashing twice in the past week...

version: "3.7"
services:
  web:
    image: backend
    build: ..
    restart: unless-stopped
    expose:
      - 8000
    ports:
      - 9109:8000
    env_file:
      - &ENV_FILE ../.env
    depends_on:
      - db
      - worker
    volumes:
      - &MEDIA_VOLUME /srv/media/:/srv/media
      - &STATIC_VOLUME /srv/static/:/srv/static
      - &TMP_VOLUME /tmp/:/tmp/host/
    logging:
      driver: journald
      options:
        tag: docker-web
  worker:
    image: backend
    environment:
      - REMAP_SIGTERM=SIGQUIT
    command: /usr/bin/start-worker.sh
    restart: unless-stopped
    env_file:
      - *ENV_FILE
    depends_on:
      - db
      - redis
      - rabbitmq
    volumes:
      - *MEDIA_VOLUME
      - *STATIC_VOLUME
      - *TMP_VOLUME
    logging:
      driver: journald
      options:
        tag: docker-worker
  db:
    image: mdillon/postgis:11
    shm_size: '256m'
    restart: unless-stopped
    env_file:
      - *ENV_FILE
    volumes:
      - /var/docker-postgres/:/var/lib/postgresql/data/
      - *TMP_VOLUME
    logging:
      driver: journald
      options:
        tag: docker-db
  memcached:
    container_name: memcached
    image: memcached:latest
    ports:
        - "11211:11211"
  rabbitmq:
    image: rabbitmq:management
    ports:
      - 5672:5672
      - 15672:15672
  redis:
    image: redis:latest
    expose:
      - 6379

Suddenly last week I started seeing errors from the web process:

  • could not translate host name "db" to address: Name or service not known
  • Error -2 connecting to redis:6379. Name or service not known.

When I checked on the processes they all seemed to be running:

$ docker-compose ps
      Name                     Command               State                Ports              
---------------------------------------------------------------------------------------------
docker_db_1         docker-entrypoint.sh postgres    Up      5432/tcp                        
docker_rabbitmq_1   docker-entrypoint.sh rabbi ...   Up      15671/tcp, 0.0.0.0:15672->15672/
                                                             tcp,:::15672->15672/tcp,        
                                                             15691/tcp, 15692/tcp, 25672/tcp,
                                                             4369/tcp, 5671/tcp, 0.0.0.0:5672
                                                             ->5672/tcp,:::5672->5672/tcp    
docker_redis_1      docker-entrypoint.sh redis ...   Up      6379/tcp                        
docker_web_1        /bin/sh -c /usr/bin/start.sh     Up      0.0.0.0:9109->8000/tcp,:::9109->
                                                             8000/tcp                        
docker_worker_1     /usr/bin/start-worker.sh         Up                                      
memcached           docker-entrypoint.sh memcached   Up      0.0.0.0:11211->11211/tcp,:::1121
                                                             1->11211/tcp                    

However, it seems the containers were unable to communicate, as these errors continued indefinitely until I stopped the containers and started them again. Then everything was fine for a few days, until it suddenly happened again with no apparent cause...

Any ideas what might be happening?!

$ docker -v
Docker version 24.0.1, build 6802122
$ uname -a
Linux redacted 3.10.0-1160.53.1.el7.x86_64 #1 SMP Fri Jan 14 13:59:45 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
$ docker-compose --version
docker-compose version 1.24.0, build 0aa59064

/etc/resolv.conf in web container:

nameserver 127.0.0.11
options ndots:0

UPDATE

OK, I found some clues in /var/log/messages from the time of the latest crash (this morning):

# still working at this point
May 27 05:02:07 my-hostname yum[31556]: Updated: docker-buildx-plugin.x86_64 0.10.5-1.el7
May 27 05:02:09 my-hostname yum[31556]: Updated: docker-ce-cli.x86_64 1:24.0.2-1.el7
May 27 05:02:10 my-hostname yum[31556]: Updated: docker-ce-rootless-extras.x86_64 24.0.2-1.el7
May 27 05:02:16 my-hostname yum[31556]: Updated: docker-ce.x86_64 3:24.0.2-1.el7
May 27 05:02:16 my-hostname systemd: Reloading.
May 27 05:02:17 my-hostname systemd: Stopping Docker Application Container Engine...
May 27 05:02:17 my-hostname dockerd: time="2023-05-27T05:02:17.087647049+10:00" level=info msg="Processing signal 'terminated'"
May 27 05:02:17 my-hostname dockerd: time="2023-05-27T05:02:17.113911997+10:00" level=info msg="Daemon shutdown complete"
May 27 05:02:17 my-hostname dockerd: time="2023-05-27T05:02:17.117216292+10:00" level=info msg="stopping event stream following graceful shutdown" error="context canceled" module=libcontainerd namespace=plugins.moby
May 27 05:02:17 my-hostname dockerd: time="2023-05-27T05:02:17.117728466+10:00" level=info msg="stopping event stream following graceful shutdown" error="context canceled" module=libcontainerd namespace=moby
May 27 05:02:17 my-hostname systemd: Stopped Docker Application Container Engine.
May 27 05:02:17 my-hostname systemd: Starting Docker Application Container Engine...
May 27 05:02:17 my-hostname dockerd: time="2023-05-27T05:02:17.301708572+10:00" level=info msg="Starting up"
May 27 05:02:18 my-hostname dockerd: time="2023-05-27T05:02:18.895994520+10:00" level=info msg="[graphdriver] using prior storage driver: overlay2"
May 27 05:02:30 my-hostname dockerd: time="2023-05-27T05:02:30.709737036+10:00" level=info msg="Loading containers: start."
May 27 05:02:30 my-hostname dockerd: time="2023-05-27T05:02:30.754470385+10:00" level=error msg="stream copy error: reading from a closed fifo"
May 27 05:02:30 my-hostname dockerd: time="2023-05-27T05:02:30.756943164+10:00" level=error msg="stream copy error: reading from a closed fifo"
May 27 05:02:30 my-hostname dockerd: time="2023-05-27T05:02:30.798420878+10:00" level=info msg="ignoring event" container=562f53739a6c564fb7ca240a68de87489c5132f513977ae53012ecba752d90c4 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
May 27 05:02:30 my-hostname containerd: time="2023-05-27T05:02:30.798418353+10:00" level=info msg="shim disconnected" id=562f53739a6c564fb7ca240a68de87489c5132f513977ae53012ecba752d90c4
May 27 05:02:30 my-hostname containerd: time="2023-05-27T05:02:30.798866207+10:00" level=warning msg="cleaning up after shim disconnected" id=562f53739a6c564fb7ca240a68de87489c5132f513977ae53012ecba752d90c4 namespace=moby
May 27 05:02:30 my-hostname containerd: time="2023-05-27T05:02:30.798950034+10:00" level=info msg="cleaning up dead shim"
May 27 05:02:30 my-hostname containerd: time="2023-05-27T05:02:30.827844408+10:00" level=warning msg="cleanup warnings time=\"2023-05-27T05:02:30+10:00\" level=info msg=\"starting signal loop\" namespace=moby pid=22533 runtime=io.containerd.runc.v2\n"
May 27 05:02:30 my-hostname dockerd: time="2023-05-27T05:02:30.924708741+10:00" level=info msg="Firewalld: docker zone already exists, returning"
May 27 05:02:31 my-hostname firewalld[704]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -t nat -D OUTPUT -m addrtype --dst-type LOCAL -j DOCKER' failed: iptables: No chain/target/match by that name.
May 27 05:02:31 my-hostname firewalld[704]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -t nat -D PREROUTING' failed: iptables: Bad rule (does a matching rule exist in that chain?).
May 27 05:02:31 my-hostname firewalld[704]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -t nat -D OUTPUT' failed: iptables: Bad rule (does a matching rule exist in that chain?).
May 27 05:02:31 my-hostname firewalld[704]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -t filter -X DOCKER' failed: iptables: Too many links.
May 27 05:02:31 my-hostname firewalld[704]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -t filter -X DOCKER-ISOLATION-STAGE-1' failed: iptables: Too many links.
May 27 05:02:31 my-hostname firewalld[704]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -t filter -F DOCKER-ISOLATION' failed: iptables: No chain/target/match by that name.
May 27 05:02:31 my-hostname firewalld[704]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -t filter -X DOCKER-ISOLATION' failed: iptables: No chain/target/match by that name.
May 27 05:02:31 my-hostname firewalld[704]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -D FORWARD -i br-7630a2794dac -o br-7630a2794dac -j DROP' failed: iptables: Bad rule (does a matching rule exist in that chain?).
May 27 05:02:31 my-hostname dockerd: time="2023-05-27T05:02:31.454941777+10:00" level=info msg="Firewalld: interface br-7630a2794dac already part of docker zone, returning"
May 27 05:02:31 my-hostname dockerd: time="2023-05-27T05:02:31.522751685+10:00" level=info msg="Firewalld: interface br-7630a2794dac already part of docker zone, returning"
May 27 05:02:31 my-hostname firewalld[704]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -D FORWARD -i docker0 -o docker0 -j DROP' failed: iptables: Bad rule (does a matching rule exist in that chain?).
May 27 05:02:31 my-hostname dockerd: time="2023-05-27T05:02:31.780958499+10:00" level=info msg="Firewalld: interface docker0 already part of docker zone, returning"
May 27 05:02:31 my-hostname dockerd: time="2023-05-27T05:02:31.844234522+10:00" level=info msg="Firewalld: interface docker0 already part of docker zone, returning"
May 27 05:02:32 my-hostname dockerd: time="2023-05-27T05:02:32.460772601+10:00" level=error msg="failed to populate fields for osl sandbox 9a40c0210cd412288d7c33eb61bad40530ef9ec48f6701bd1e7184c13fa64d3c"
May 27 05:02:32 my-hostname dockerd: time="2023-05-27T05:02:32.485507497+10:00" level=error msg="failed to populate fields for osl sandbox cd0489e190d48bd6f4e1361ffb1ef948c5b62695f1c3cc08c0793a12a36e0a70"
May 27 05:02:32 my-hostname dockerd: time="2023-05-27T05:02:32.511399464+10:00" level=error msg="failed to populate fields for osl sandbox cf16beecf54d69628a896ed823c890010dc59a288545514b3caab4288b96bbd3"
May 27 05:02:32 my-hostname dockerd: time="2023-05-27T05:02:32.538845672+10:00" level=error msg="failed to populate fields for osl sandbox d9d36cdee5fed3740df123decea7d79f189f42de8879500e36c5ce96695603cb"
May 27 05:02:32 my-hostname dockerd: time="2023-05-27T05:02:32.567477583+10:00" level=error msg="failed to populate fields for osl sandbox 6735141218e84ebd4f0dc0bfe178c926eb4aedd878cc957463d44506d2e55e83"
May 27 05:02:32 my-hostname dockerd: time="2023-05-27T05:02:32.597308395+10:00" level=error msg="failed to populate fields for osl sandbox 74703e4f1bd0cc3d22c0eac14e46c3794a8daef00727f775dd4c4389ee728875"
May 27 05:02:32 my-hostname kernel: br-7630a2794dac: port 8(vethdba34b5) entered disabled state
May 27 05:02:32 my-hostname kernel: device vethdba34b5 left promiscuous mode
May 27 05:02:32 my-hostname kernel: br-7630a2794dac: port 8(vethdba34b5) entered disabled state
May 27 05:02:32 my-hostname NetworkManager[708]: <info>  [1685127752.6322] device (vethdba34b5): released from master device br-7630a2794dac
May 27 05:02:33 my-hostname dockerd: time="2023-05-27T05:02:33.030110748+10:00" level=info msg="Removing stale sandbox 8c4bad304b8e2c68b55d49ae928839050b9e6fc20caf369c6d3fae18f2f22f89 (562f53739a6c564fb7ca240a68de87489c5132f513977ae53012ecba752d90c4)"
May 27 05:02:33 my-hostname dockerd: time="2023-05-27T05:02:33.038984969+10:00" level=warning msg="Error (Unable to complete atomic operation, key modified) deleting object [endpoint 7630a2794dacc176f1eca74b359658aa646fc256cf8189af6a0963e182e8f85f 4f0d21ab87e2c9f5bcca66178add3e2bd787af62be9be0bc1ef21e41d1ddab6e], retrying...."
May 27 05:02:33 my-hostname dockerd: time="2023-05-27T05:02:33.055058812+10:00" level=error msg="failed to populate fields for osl sandbox 945e03bae7b186562c5a3d2993f8d2b4d6fcd1ddea6277743aaac5c20dd26b50"
May 27 05:02:33 my-hostname dockerd: time="2023-05-27T05:02:33.055937650+10:00" level=info msg="there are running containers, updated network configuration will not take affect"
May 27 05:02:33 my-hostname dockerd: time="2023-05-27T05:02:33.058198587+10:00" level=info msg="Loading containers: done."
May 27 05:02:33 my-hostname dockerd: time="2023-05-27T05:02:33.163904779+10:00" level=info msg="Docker daemon" commit=659604f graphdriver=overlay2 version=24.0.2
May 27 05:02:33 my-hostname dockerd: time="2023-05-27T05:02:33.164168658+10:00" level=info msg="Daemon has completed initialization"
May 27 05:02:33 my-hostname dockerd: time="2023-05-27T05:02:33.213072781+10:00" level=info msg="API listen on /var/run/docker.sock"
May 27 05:02:33 my-hostname systemd: Started Docker Application Container Engine.
# now failing

and the previous failure:

May 21 07:05:59 my-hostname yum[7323]: Updated: docker-compose-plugin.x86_64 2.18.1-1.el7
May 21 07:06:00 my-hostname yum[7323]: Updated: docker-ce-cli.x86_64 1:24.0.1-1.el7
May 21 07:06:01 my-hostname yum[7323]: Updated: docker-ce-rootless-extras.x86_64 24.0.1-1.el7
# working at this point
May 21 07:06:07 my-hostname yum[7323]: Updated: docker-ce.x86_64 3:24.0.1-1.el7
May 21 07:06:07 my-hostname systemd: Reloading.
May 21 07:06:07 my-hostname systemd: Stopping Docker Application Container Engine...
May 21 07:06:07 my-hostname dockerd: time="2023-05-21T07:06:07.434010268+10:00" level=info msg="Processing signal 'terminated'"
May 21 07:06:07 my-hostname dockerd: time="2023-05-21T07:06:07.462825347+10:00" level=info msg="Daemon shutdown complete"
May 21 07:06:07 my-hostname dockerd: time="2023-05-21T07:06:07.463607568+10:00" level=info msg="stopping event stream following graceful shutdown" error="context canceled" module=libcontainerd namespace=moby
May 21 07:06:07 my-hostname dockerd: time="2023-05-21T07:06:07.463903566+10:00" level=info msg="stopping event stream following graceful shutdown" error="context canceled" module=libcontainerd namespace=plugins.moby
May 21 07:06:07 my-hostname systemd: Stopped Docker Application Container Engine.
May 21 07:06:07 my-hostname systemd: Starting Docker Application Container Engine...
May 21 07:06:07 my-hostname dockerd: time="2023-05-21T07:06:07.712187157+10:00" level=info msg="Starting up"
May 21 07:06:09 my-hostname dockerd: time="2023-05-21T07:06:09.213598321+10:00" level=info msg="[graphdriver] using prior storage driver: overlay2"
May 21 07:06:21 my-hostname dockerd: time="2023-05-21T07:06:21.320119275+10:00" level=info msg="Loading containers: start."
May 21 07:06:21 my-hostname dockerd: time="2023-05-21T07:06:21.428525472+10:00" level=info msg="Firewalld: docker zone already exists, returning"
May 21 07:06:21 my-hostname firewalld[704]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -t nat -D OUTPUT -m addrtype --dst-type LOCAL -j DOCKER' failed: iptables: No chain/target/match by that name.
May 21 07:06:21 my-hostname firewalld[704]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -t nat -D PREROUTING' failed: iptables: Bad rule (does a matching rule exist in that chain?).
May 21 07:06:21 my-hostname firewalld[704]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -t nat -D OUTPUT' failed: iptables: Bad rule (does a matching rule exist in that chain?).
May 21 07:06:21 my-hostname firewalld[704]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -t filter -X DOCKER' failed: iptables: Too many links.
May 21 07:06:21 my-hostname firewalld[704]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -t filter -X DOCKER-ISOLATION-STAGE-1' failed: iptables: Too many links.
May 21 07:06:21 my-hostname firewalld[704]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -t filter -F DOCKER-ISOLATION' failed: iptables: No chain/target/match by that name.
May 21 07:06:21 my-hostname firewalld[704]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -t filter -X DOCKER-ISOLATION' failed: iptables: No chain/target/match by that name.
May 21 07:06:22 my-hostname firewalld[704]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -D FORWARD -i br-7630a2794dac -o br-7630a2794dac -j DROP' failed: iptables: Bad rule (does a matching rule exist in that chain?).
May 21 07:06:22 my-hostname dockerd: time="2023-05-21T07:06:22.104048740+10:00" level=info msg="Firewalld: interface br-7630a2794dac already part of docker zone, returning"
May 21 07:06:22 my-hostname dockerd: time="2023-05-21T07:06:22.176584026+10:00" level=info msg="Firewalld: interface br-7630a2794dac already part of docker zone, returning"
May 21 07:06:22 my-hostname firewalld[704]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -D FORWARD -i docker0 -o docker0 -j DROP' failed: iptables: Bad rule (does a matching rule exist in that chain?).
May 21 07:06:22 my-hostname dockerd: time="2023-05-21T07:06:22.493966150+10:00" level=info msg="Firewalld: interface docker0 already part of docker zone, returning"
May 21 07:06:22 my-hostname dockerd: time="2023-05-21T07:06:22.587591571+10:00" level=info msg="Firewalld: interface docker0 already part of docker zone, returning"
May 21 07:06:23 my-hostname dockerd: time="2023-05-21T07:06:23.228002494+10:00" level=error msg="failed to populate fields for osl sandbox bc76908fce84399c2679fb6ec97763ec2b3ed11cfc61599ae25df14cc99cab81"
May 21 07:06:23 my-hostname dockerd: time="2023-05-21T07:06:23.253473513+10:00" level=error msg="failed to populate fields for osl sandbox dfd2cc470474dd5a0dd7f067fc6988621c53e14c86f06be738b55cb248985965"
May 21 07:06:23 my-hostname dockerd: time="2023-05-21T07:06:23.277796542+10:00" level=error msg="failed to populate fields for osl sandbox 53adb48ffcd597198cfb549eab5f9b0e34b6b2d565a82a22ea3b7cfe8198e48b"
May 21 07:06:23 my-hostname dockerd: time="2023-05-21T07:06:23.303381498+10:00" level=error msg="failed to populate fields for osl sandbox 6f664d964c34d953bc869dd97cb52345e7271b991423f6defc277d57ab1d8d18"
May 21 07:06:23 my-hostname dockerd: time="2023-05-21T07:06:23.333007410+10:00" level=error msg="failed to populate fields for osl sandbox 8c780783dc3060797e991c1fa896e894b25faf94a94ca255731928d339a3fca1"
May 21 07:06:23 my-hostname dockerd: time="2023-05-21T07:06:23.363884199+10:00" level=error msg="failed to populate fields for osl sandbox 90004d1d46832307ae98d041c56cea879787fca8c8c9d20c62df0978784b707f"
May 21 07:06:23 my-hostname dockerd: time="2023-05-21T07:06:23.364885308+10:00" level=error msg="failed to populate fields for osl sandbox 945e03bae7b186562c5a3d2993f8d2b4d6fcd1ddea6277743aaac5c20dd26b50"
May 21 07:06:23 my-hostname dockerd: time="2023-05-21T07:06:23.366407341+10:00" level=info msg="there are running containers, updated network configuration will not take affect"
May 21 07:06:23 my-hostname dockerd: time="2023-05-21T07:06:23.368298228+10:00" level=info msg="Loading containers: done."
May 21 07:06:23 my-hostname dockerd: time="2023-05-21T07:06:23.480334538+10:00" level=info msg="Docker daemon" commit=463850e graphdriver=overlay2 version=24.0.1
May 21 07:06:23 my-hostname dockerd: time="2023-05-21T07:06:23.480582539+10:00" level=info msg="Daemon has completed initialization"
May 21 07:06:23 my-hostname dockerd: time="2023-05-21T07:06:23.537863940+10:00" level=info msg="API listen on /var/run/docker.sock"
May 21 07:06:23 my-hostname systemd: Started Docker Application Container Engine.
# now failing

So I think the trigger for the issue is the automatic update of the docker packages by yum on CentOS, and the resulting restart of the Docker Application Container Engine, after which the networking breaks down. The message "there are running containers, updated network configuration will not take affect" sounds particularly suspicious!

Aside from turning off automated yum updates, how can I ensure that the container networking won't break in these situations?

Answer 1

Score: 1


OK, I think I found the issue: https://github.com/moby/moby/issues/45646

It was working for years until the auto-upgrade to docker-ce.x86_64 3:24.0.1-1.el7, so I assume this Docker bug was introduced at that point.

I've added docker* to my exclude config in /etc/yum.conf for now to avoid broken restarts until the moby bug is fixed.
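
For reference, the exclude directive lives in the [main] section of /etc/yum.conf; a minimal illustrative excerpt (other settings omitted):

# /etc/yum.conf
[main]
exclude=docker*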

Answer 2

Score: 0

Have you tried with a specific network?

Like this:

version: "3.7"
services:
  web:
    image: backend
    build: ..
    restart: unless-stopped
    expose:
      - 8000
    ports:
      - 9109:8000
    env_file:
      - &ENV_FILE ../.env
    depends_on:
      - db
      - worker
    volumes:
      - &MEDIA_VOLUME /srv/media/:/srv/media
      - &STATIC_VOLUME /srv/static/:/srv/static
      - &TMP_VOLUME /tmp/:/tmp/host/
    logging:
      driver: journald
      options:
        tag: docker-web
    networks:
      my_network:

  worker:
    image: backend
    environment:
      - REMAP_SIGTERM=SIGQUIT
    command: /usr/bin/start-worker.sh
    restart: unless-stopped
    env_file:
      - *ENV_FILE
    depends_on:
      - db
      - redis
      - rabbitmq
    volumes:
      - *MEDIA_VOLUME
      - *STATIC_VOLUME
      - *TMP_VOLUME
    logging:
      driver: journald
      options:
        tag: docker-worker
    networks:
      my_network:

  db:
    image: mdillon/postgis:11
    shm_size: '256m'
    restart: unless-stopped
    env_file:
      - *ENV_FILE
    volumes:
      - /var/docker-postgres/:/var/lib/postgresql/data/
      - *TMP_VOLUME
    logging:
      driver: journald
      options:
        tag: docker-db
    networks:
      my_network:

  memcached:
    container_name: memcached
    image: memcached:latest
    ports:
      - "11211:11211"
    networks:
      my_network:

  rabbitmq:
    image: rabbitmq:management
    ports:
      - 5672:5672
      - 15672:15672
    networks:
      my_network:

  redis:
    image: redis:latest
    expose:
      - 6379
    networks: 
      my_network:

networks:
  my_network:
    driver: bridge
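
If it helps, you can check that the services actually landed on the new network. This is just an illustrative check: with Compose v1 the network name is prefixed with the project name, and judging by the container names in the question the project is probably "docker", so the network would be "docker_my_network":

# assumption: project name is "docker", so the network is docker_my_network
docker network ls
docker network inspect docker_my_network --format '{{ range .Containers }}{{ .Name }} {{ end }}'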

Answer 3

Score: 0


Since the containers are having issues talking to each other, one thing we can test is whether DNS resolution is working correctly.

Check if DNS resolution is working correctly within the Docker network. You can run a temporary container and test whether DNS resolution works for the service names. For example:

docker run --rm -it --network=<your_network> alpine nslookup db
docker run --rm -it --network=<your_network> alpine nslookup redis
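
To find the value to use for <your_network>, you can list the networks Compose created; they normally carry the com.docker.compose.project label. Given the container names in the question, the default network is probably called docker_default (an assumption, not confirmed by the poster):

# list Compose-created networks; pick the one for this project (likely docker_default)
docker network ls --filter label=com.docker.compose.project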

--> If you have any DNS caching mechanisms in place (e.g., DNS caching on the host machine or within the containers), ensure that they are not causing any conflicts or outdated information.

--> As a last resort Restarting Docker: If the issue persists and you cannot find any other causes, you can try restarting the Docker service itself. This may help resolve any potential networking or DNS-related issues.

It's also worth mentioning that updating Docker Compose to the latest version might help resolve known issues or bugs. The version you are using is docker-compose 1.24.0, build 0aa59064.

Many new versions have been released since then; please check https://github.com/docker/compose/releases for the latest ones.

I had been facing issues because of the old docker-compose version used in my application; once I upgraded to the latest one, the problem was fixed. The issue was not exactly the same as yours, though.

I would not suggest testing in the production environment, but in parallel you can update the docker-compose version and test in dev or QA to monitor for issues.

I would also suggest defining a network in the docker-compose file so that the containers are linked through an explicit network. I am not sure whether it will resolve this name resolution issue, but it is something you should be implementing in your services anyway.

Hope this will help.

Answer 4

Score: 0


I have written this as an answer instead of a comment, for text-formatting reasons.

Can you try putting the following in /etc/resolv.conf?

search localdomain
nameserver 127.0.0.11
options edns0 trust-ad ndots:0

which is what I get when I run your docker-compose.yml.

Also, check /etc/resolv.conf in all your containers.
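
For example, a quick loop over all running containers could look like this (assuming each image ships a cat binary):

# dump the resolver config of every running container
for c in $(docker ps --format '{{.Names}}'); do
  echo "== $c =="
  docker exec "$c" cat /etc/resolv.conf
done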

Answer 5

Score: 0


Such issues when upgrading the Docker daemon are expected: on upgrade, the Docker daemon shuts down, and existing containers are restarted by default.

You can configure the Docker daemon to enable Live Restore. Edit or create /etc/docker/daemon.json (the default path for the Docker daemon config on Linux) to specify:

{
  "live-restore": true
}

Note, however, that it may not work across major upgrades (such as 23.x to 24.x), so you may want to configure automatic updates accordingly.
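
As a sketch of how to apply and verify this on a systemd-managed host like this one (assuming the service unit is named docker): a reload sends SIGHUP to dockerd, which picks up live-restore without restarting the running containers.

# reload the daemon config without restarting containers, then confirm the setting
sudo systemctl reload docker
docker info --format '{{ .LiveRestoreEnabled }}'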
