Error "group docker not found" attempting to start docker:dind service in gitlab ci/cd pipeline on private runner
Question
I am getting an error trying to run Docker images in a GitLab CI/CD pipeline on a private runner. The pipeline runs correctly on the shared runners.
It looks like the docker:dind service is not starting.
Preparing the "docker" executor
Using Docker executor with image docker:latest ...
Starting service docker:dind ...
Pulling docker image docker:dind ...
Using docker image sha256:e072c2e5e5506659f7d5794b6f47fcaa3bb84c8b165609bf199ce483386cd0fe for docker:dind with digest docker@sha256:a2e34bde4cb23eaef4f3d5016c78f4a7ee06b65f80d07c7ba69a1e262977a97a ...
Waiting for services to be up and running (timeout 30 seconds)...
*** WARNING: Service runner-tx15ndy-project-3223950-concurrent-0-cfd5aa554405b7e4-docker-0 probably didn't start properly.
Health check error:
service "runner-tx15ndy-project-3223950-concurrent-0-cfd5aa554405b7e4-docker-0-wait-for-service" timeout
Health check container logs:
Service container logs:
2023-04-17T13:16:23.860176290Z Certificate request self-signature ok
2023-04-17T13:16:23.860224269Z subject=CN = docker:dind server
2023-04-17T13:16:23.882609115Z /certs/server/cert.pem: OK
2023-04-17T13:16:24.829195915Z Certificate request self-signature ok
2023-04-17T13:16:24.829221989Z subject=CN = docker:dind client
2023-04-17T13:16:24.851576744Z /certs/client/cert.pem: OK
2023-04-17T13:16:24.973935031Z time="2023-04-17T13:16:24.973766812Z" level=info msg="Starting up"
2023-04-17T13:16:24.976897417Z time="2023-04-17T13:16:24.976800011Z" level=warning msg="could not change group /var/run/docker.sock to docker: group docker not found"
...
$ docker build --tag $CAM_NATS_IMAGE .
ERROR: Cannot connect to the Docker daemon at tcp://docker:2375. Is the docker daemon running?
This is the ci/cd configuration with some details removed for clarity:
# Build the cam service images
build-cam-service:
  stage: build-cam-service
  image: docker:latest
  services:
    - docker:dind
  script:
    # Build prod images
    - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
    [build some images and push to gitlab container registry]

test-cam-requester:
  stage: test-cam-service
  image: docker:latest
  services:
    - docker:dind
  script:
    - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
    [use docker pull and docker run to run and test the images]
This is the private runner configuration:
[[runners]]
  name = "dind 2"
  url = "https://gitlab.com/"
  token = "***"
  executor = "docker"
  [runners.docker]
    tls_verify = false
    image = "docker:23.0.1-cli-alpine3.17"
    privileged = true
    disable_cache = false
    volumes = ["/cache"]
    shm_size = 0
  [runners.cache]
I have also tried:
volumes = ["/var/run/docker.sock:/var/run/docker.sock", "/cache"]
With the above, the pipeline runs, but this results in the test jobs creating containers on the runner itself, not on the docker:dind service host.
What is the correct way to set up a private runner for a docker-in-docker ci/cd pipeline?
Answer 1
Score: 1
Your runner configuration should reference the Docker certificates. For instance, assuming you use the GitLab Kubernetes executor, your runner configuration may look like this:
runners:
  tags: "docker-on-prem"
  executor: kubernetes
  name: "dind"
  config: |
    [[runners]]
      [runners.kubernetes]
        namespace = "{{.Release.Namespace}}"
        privileged = true
      [[runners.kubernetes.volumes.empty_dir]]
        name = "docker-certs"
        mount_path = "/certs/client"
        medium = "Memory"

# creates service account necessary for scheduling Kubernetes executors for each new pipeline job
rbac:
  create: true
Now, your .gitlab-ci.yml has to reference the Docker certificates and point to the Docker daemon:
stages:
  - build

services:
  - name: docker:20.10.16-dind
    command: ["--mtu=1300"]

build:
  tags: [docker-on-prem]
  image: docker:20.10.16
  stage: build
  variables:
    DOCKER_HOST: "tcp://docker:2376"
    DOCKER_TLS_CERTDIR: "/certs"
    DOCKER_TLS_VERIFY: "1"
    DOCKER_CERT_PATH: "$DOCKER_TLS_CERTDIR/client"
  script:
    - docker build -t test .
You can see the above in a fully working GitLab Self-Hosted Runners Demo, and there is more information in the GitLab docs.
Disclaimer: I wrote the above article.
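For the Docker executor used in the question, the counterpart of the Kubernetes empty_dir mount is most likely a /certs/client entry in the runner's volumes list, so that the job container can read the client certificates generated by the docker:dind service. The following is a minimal sketch, not a verified fix: it reuses the question's config.toml as posted and assumes the only change needed on the runner side is the extra volume.

```toml
# Sketch only: the question's runner config with one assumed change,
# a "/certs/client" volume shared between the job and service containers.
[[runners]]
  name = "dind 2"
  url = "https://gitlab.com/"
  token = "***"
  executor = "docker"
  [runners.docker]
    tls_verify = false
    image = "docker:23.0.1-cli-alpine3.17"
    privileged = true
    disable_cache = false
    # "/certs/client" is the addition; "/cache" was already present
    volumes = ["/certs/client", "/cache"]
    shm_size = 0
```

With this in place, the jobs would still need the TLS variables from the .gitlab-ci.yml above (DOCKER_HOST on port 2376, DOCKER_TLS_CERTDIR, DOCKER_TLS_VERIFY, DOCKER_CERT_PATH), since the log in the question shows the client trying the non-TLS port 2375.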
Answer 2
Score: 0
Pretty much every time I had this issue, it was actually caused by DOCKER_DRIVER: overlay, because the overlay filesystem module wasn't installed or loaded on the Docker host system (I guess the same applies to overlay2). After installing the missing module or removing that line from the .gitlab-ci.yml, it worked fine even without the /var/run/docker.sock:/var/run/docker.sock volume binding (at least in my cases). Also ensure that privileged is set to true.
https://gitlab.com/gitlab-org/gitlab-runner/-/issues/1986#note_33053623