Error "group docker not found" attempting to start docker:dind service in gitlab ci/cd pipeline on private runner

Question

I am getting an error when trying to run Docker images in a GitLab CI/CD pipeline on a private runner. The pipeline runs correctly on the shared runners.

It looks like the docker:dind service is not starting.

Preparing the "docker" executor
Using Docker executor with image docker:latest ...
Starting service docker:dind ...
Pulling docker image docker:dind ...
Using docker image sha256:e072c2e5e5506659f7d5794b6f47fcaa3bb84c8b165609bf199ce483386cd0fe for docker:dind with digest docker@sha256:a2e34bde4cb23eaef4f3d5016c78f4a7ee06b65f80d07c7ba69a1e262977a97a ...
Waiting for services to be up and running (timeout 30 seconds)...
*** WARNING: Service runner-tx15ndy-project-3223950-concurrent-0-cfd5aa554405b7e4-docker-0 probably didn't start properly.
Health check error:
service "runner-tx15ndy-project-3223950-concurrent-0-cfd5aa554405b7e4-docker-0-wait-for-service" timeout
Health check container logs:
Service container logs:
2023-04-17T13:16:23.860176290Z Certificate request self-signature ok
2023-04-17T13:16:23.860224269Z subject=CN = docker:dind server
2023-04-17T13:16:23.882609115Z /certs/server/cert.pem: OK
2023-04-17T13:16:24.829195915Z Certificate request self-signature ok
2023-04-17T13:16:24.829221989Z subject=CN = docker:dind client
2023-04-17T13:16:24.851576744Z /certs/client/cert.pem: OK
2023-04-17T13:16:24.973935031Z time="2023-04-17T13:16:24.973766812Z" level=info msg="Starting up"
2023-04-17T13:16:24.976897417Z time="2023-04-17T13:16:24.976800011Z" level=warning msg="could not change group /var/run/docker.sock to docker: group docker not found"
...
$ docker build --tag $CAM_NATS_IMAGE .
ERROR: Cannot connect to the Docker daemon at tcp://docker:2375. Is the docker daemon running?

This is the CI/CD configuration, with some details removed for clarity:

# Build the cam service images
build-cam-service:
  stage: build-cam-service
  image: docker:latest
  services:
    - docker:dind
  script:
    # Build prod images
    - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
    # [build some images and push to GitLab container registry]

test-cam-requester:
  stage: test-cam-service
  image: docker:latest
  services:
    - docker:dind
  script:
    - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
    # [use docker pull and docker run to run and test the images]

This is the private runner configuration:

[[runners]]
  name = "dind 2"
  url = "https://gitlab.com/"
  token = "***"
  executor = "docker"
  [runners.docker]
    tls_verify = false
    image = "docker:23.0.1-cli-alpine3.17"
    privileged = true
    disable_cache = false
    volumes = ["/cache"]
    shm_size = 0
  [runners.cache]

I have also tried:

volumes = ["/var/run/docker.sock:/var/run/docker.sock", "/cache"]

With this volume binding the pipeline runs, but the test jobs then create containers on the runner host itself rather than inside the docker:dind service.

What is the correct way to set up a private runner for a Docker-in-Docker CI/CD pipeline?

Answer 1

Score: 1

Your runner configuration should reference the Docker certificates. For instance, assuming you use the GitLab Kubernetes executor, your runner configuration may look like this:

runners:
  tags: "docker-on-prem"
  executor: kubernetes
  name: "dind"
  config: |
    [[runners]]
      [runners.kubernetes]
        namespace = "{{.Release.Namespace}}"
        privileged = true
      [[runners.kubernetes.volumes.empty_dir]]
        name = "docker-certs"
        mount_path = "/certs/client"
        medium = "Memory"

# creates the service account necessary for scheduling Kubernetes executors for each new pipeline job
rbac:
  create: true

Your .gitlab-ci.yml then has to reference the Docker certificates and point to the Docker daemon:

stages:
  - build

services:
  - name: docker:20.10.16-dind
    command: ["--mtu=1300"]

build:
  tags: [docker-on-prem]
  image: docker:20.10.16
  stage: build
  variables:
    DOCKER_HOST: "tcp://docker:2376"
    DOCKER_TLS_CERTDIR: "/certs"
    DOCKER_TLS_VERIFY: "1"
    DOCKER_CERT_PATH: "$DOCKER_TLS_CERTDIR/client"
  script:
    - docker build -t test .

You can see the above in a fully working GitLab Self-Hosted Runners Demo, and find more information in the GitLab docs.

Disclaimer: I wrote the above article.
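
For the Docker executor used in the question (rather than Kubernetes), the same idea can be sketched as follows. The service log in the question shows dind generating TLS certificates while the client tries the plain-text port 2375, which suggests the job container cannot see the client certificates. This sketch assumes the runner's config.toml keeps privileged = true and sets volumes = ["/certs/client", "/cache"], so that the certificates are shared between the service and the job container. The job name and image tags are copied from the question; docker info is added only as a connectivity check.

```yaml
# .gitlab-ci.yml sketch for the Docker executor with TLS enabled.
# Assumption: the runner's config.toml has privileged = true and
# volumes = ["/certs/client", "/cache"].
build-cam-service:
  stage: build-cam-service
  image: docker:latest
  services:
    - docker:dind
  variables:
    # dind generates its certificates under this directory; the client
    # certificates land in /certs/client, which the runner volume shares
    # with the job container.
    DOCKER_TLS_CERTDIR: "/certs"
  script:
    # Fails fast if the client cannot reach the dind daemon.
    - docker info
    - docker build --tag $CAM_NATS_IMAGE .
```

If TLS is not needed, an alternative is to leave the volumes as ["/cache"] and set DOCKER_TLS_CERTDIR: "" together with DOCKER_HOST: tcp://docker:2375 in the job's variables, so that both the daemon and the client use the unencrypted port.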

Answer 2

Score: 0

Almost every time I hit this issue, it was actually caused by DOCKER_DRIVER: overlay, because the overlay filesystem kernel module wasn't installed or loaded on the Docker host (I guess the same applies to overlay2). After installing the missing module, or removing that line from .gitlab-ci.yml, it worked fine even without the /var/run/docker.sock:/var/run/docker.sock volume binding (at least in my cases). Also make sure that privileged is set to true.

https://gitlab.com/gitlab-org/gitlab-runner/-/issues/1986#note_33053623
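
For illustration, this is roughly where that setting lives in .gitlab-ci.yml. It is a sketch, not taken from the question (whose configuration does not show a DOCKER_DRIVER line). The overlay2 value assumes the runner host's kernel has the overlay module available, which can be checked on the host with lsmod | grep overlay.

```yaml
# Sketch of a global variables block in .gitlab-ci.yml.
variables:
  # Assumption: the overlay kernel module is loaded on the runner host.
  # If it is not, either load it on the host or remove this line so that
  # the dind daemon picks a storage driver on its own.
  DOCKER_DRIVER: overlay2
```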

huangapple
  • Published on 2023-04-17 21:57:59
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76035968.html