Building image and storing in cache, but those stages are still being built after docker load and --cache-from in docker build


Question

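Reworking our Azure Pipelines to try and speed them up.

Currently, the pipeline rebuilds the same dependent stages for our `unit-testing` and `integration-testing` stages, which is slow and inefficient.

  • I'm building the image with `--target development` and storing it in the cache with `Cache@2`.
  • In our `UnitTest` stage, a task successfully loads it from the cache using `docker load -i`, and I confirmed it with `docker images`.
  • The pipeline then gets to the actual `unit-tests` build, where I use `--cache-from=` and `--target unit-tests`.
  • In the pipeline output I can see it acknowledging the cache, but it still builds the stages that the cached image should already contain:
#4 importing cache manifest from companyapp-api:pr-api
#4 sha256:7c6bf1eebafe5af983d68e3fb7d72c271b8a80918f9799979ebd2b2bea604d10
#4 DONE 0.0s

#5 [python-base 1/1] FROM docker.io/library/python:3.9-slim
#5 sha256:f876c6f14c8c365d299789228d8a0c38ac92e17ea62116c830f5b7c6bc684e47
#5 DONE 0.0s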
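
For reference, the flow described above can be sketched as plain `docker` commands. This is a rough sketch rather than the pipeline's actual script: the `run` helper and the `DRY_RUN` switch are hypothetical, the `cache_tar` path is a placeholder, and `api` as the service name is inferred from the `companyapp-api:pr-api` tag in the log.

```shell
#!/usr/bin/env bash
# Sketch of the build / save / load / rebuild flow from the list above.
# Assumptions: service name "api" (inferred from the log), a placeholder
# cache path, and a hypothetical dry-run helper that only prints commands.
set -eu

image_ref="companyapp-api:pr-api"
dockerfile="api/docker/Dockerfile"
context="api"
cache_tar="docker/cache.tar"

# Print the command by default; set DRY_RUN=0 to actually execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

export DOCKER_BUILDKIT=1

# Build stage: build the development target and save it for the cache.
run docker build --target development -t "$image_ref" -f "$dockerfile" "$context"
run docker save -o "$cache_tar" "$image_ref"

# UnitTest stage: restore the image, then build the unit-tests target.
run docker load -i "$cache_tar"
run docker images
run docker build --cache-from="$image_ref" --target unit-tests -t "$image_ref" -f "$dockerfile" "$context"
```

Run as-is, the script only prints the five commands; with `DRY_RUN=0` it would execute them against a local Docker daemon.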

As for the files I'm working with...

# ./api/docker/Dockerfile

# creating a python base with shared environment variables
FROM python:3.9-slim as python-base
ENV PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1 \
    PIP_NO_CACHE_DIR=off \
    PIP_DISABLE_PIP_VERSION_CHECK=on \
    PIP_DEFAULT_TIMEOUT=100 \
    POETRY_HOME="/opt/poetry" \
    POETRY_VIRTUALENVS_IN_PROJECT=true \
    POETRY_NO_INTERACTION=1 \
    PYSETUP_PATH="/opt/pysetup" \
    VENV_PATH="/opt/pysetup/.venv"

ENV PATH="$POETRY_HOME/bin:$VENV_PATH/bin:$PATH"


# builder-base is used to build dependencies
FROM python-base as builder-base
RUN apt-get update \
    && apt-get install --no-install-recommends -y \
        curl \
        build-essential

# Install Poetry - respects $POETRY_VERSION & $POETRY_HOME
ENV POETRY_VERSION=1.4.1 GET_POETRY_IGNORE_DEPRECATION=1
RUN curl -sSL https://install.python-poetry.org | python3 -

# We copy our Python requirements here to cache them
# and install only runtime deps using poetry
WORKDIR $PYSETUP_PATH
COPY ./poetry.lock ./pyproject.toml ./
RUN poetry install --no-dev


# 'development' stage installs all dev deps and can be used to develop code.
# For example using docker-compose to mount local volume under /app
FROM python-base as development

# Copying poetry and venv into image
COPY --from=builder-base $POETRY_HOME $POETRY_HOME
COPY --from=builder-base $PYSETUP_PATH $PYSETUP_PATH

# Copying in our entrypoint
# COPY ./docker/docker-entrypoint.sh /docker-entrypoint.sh
RUN chmod +x . /opt/pysetup/.venv/bin/activate

# venv already has runtime deps installed we get a quicker install
WORKDIR $PYSETUP_PATH
RUN poetry install
WORKDIR /app
COPY . .
EXPOSE 5000 5672
CMD ["python", "src/manage.py", "runserver", "0.0.0.0:5000"]


# 'unit-tests' stage runs our unit tests with unittest and coverage.
FROM development AS unit-tests
RUN coverage run --omit='src/manage.py,src/config/*,*/.venv/*,*/*__init__.py,*/tests.py,*/admin.py' src/manage.py test src --tag=ut && \
    coverage report

# ./pipelines/pr.yaml
# This is triggered by the PR and branch policies
trigger: none

# Read in the base variable template
variables:
  imageRepository: companyapp
  dockerfilePath: $(Build.SourcesDirectory)
  vmImageName: ubuntu-latest

# Use the ubuntu-latest image
pool:
  vmImage: $(vmImageName)

stages:
- stage: Build
  displayName: Build image for tests...
  jobs:
  - job: BuildingAndCache
    displayName: Building and caching image for tests...
    steps:

    - task: Cache@2
      displayName: Creating cache...
      inputs:
        key: 'docker | "$(Agent.OS)" | cache'
        path: $(Pipeline.Workspace)/docker
        cacheHitVar: CACHE_RESTORED

    - task: Docker@2
      displayName: Building image for tests...
      inputs:
        command: 'build'
        repository: $(imageRepository)-$(service)
        dockerfile: $(dockerFilePath)/$(service)/docker/Dockerfile
        buildContext: $(dockerFilePath)/$(service)
        arguments: |
          --target development
        tags: |
          pr-$(service)
      env:
        DOCKER_BUILDKIT: 1

    - bash: |
        mkdir -p $(Pipeline.Workspace)/docker
        docker save -o $(Pipeline.Workspace)/docker/cache.tar $(imageRepository)-$(service):pr-$(service)
      displayName: Saving image to cache...
      # condition: and(not(canceled()), not(failed()), ne(variables.CACHE_RESTORED, 'true'))


- stage: UnitTest
  displayName: Run unit tests...
  jobs:
  - job: UnitTesting
    displayName: Running unit tests...
    steps:
    - task: Cache@2
      displayName: Checking cache for existing images...
      inputs:
        key: 'docker | "$(Agent.OS)" | cache'
        path: $(Pipeline.Work

<details>
<summary>English:</summary>

Reworking our Azure Pipelines to try and speed them up. 

Currently it is rebuilding the same dependent stages for our `unit-testing` and `integration-testing` stages... which is really slow and inefficient.

- I'm building the image with `--target development` and storing it in the cache with `Cache@2`.
- In our `UnitTest` stage, a task successfully loads it from the cache using `docker load -i`, and I confirmed it with `docker images`.
- The pipeline then gets to the actual `unit-tests` build, where I use `--cache-from=` and `--target unit-tests`.
- In the pipeline output I can see it acknowledging the cache, but it still builds the stages that the cached image should already contain:

#4 importing cache manifest from companyapp-api:pr-api
#4 sha256:7c6bf1eebafe5af983d68e3fb7d72c271b8a80918f9799979ebd2b2bea604d10
#4 DONE 0.0s

#5 [python-base 1/1] FROM docker.io/library/python:3.9-slim
#5 sha256:f876c6f14c8c365d299789228d8a0c38ac92e17ea62116c830f5b7c6bc684e47
#5 DONE 0.0s


As for the files I'm working with...

# ./api/docker/Dockerfile

# creating a python base with shared environment variables
FROM python:3.9-slim as python-base
ENV PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1 \
    PIP_NO_CACHE_DIR=off \
    PIP_DISABLE_PIP_VERSION_CHECK=on \
    PIP_DEFAULT_TIMEOUT=100 \
    POETRY_HOME="/opt/poetry" \
    POETRY_VIRTUALENVS_IN_PROJECT=true \
    POETRY_NO_INTERACTION=1 \
    PYSETUP_PATH="/opt/pysetup" \
    VENV_PATH="/opt/pysetup/.venv"

ENV PATH="$POETRY_HOME/bin:$VENV_PATH/bin:$PATH"


# builder-base is used to build dependencies
FROM python-base as builder-base
RUN apt-get update \
    && apt-get install --no-install-recommends -y \
        curl \
        build-essential

# Install Poetry - respects $POETRY_VERSION & $POETRY_HOME
ENV POETRY_VERSION=1.4.1 GET_POETRY_IGNORE_DEPRECATION=1
RUN curl -sSL https://install.python-poetry.org | python3 -

# We copy our Python requirements here to cache them
# and install only runtime deps using poetry
WORKDIR $PYSETUP_PATH
COPY ./poetry.lock ./pyproject.toml ./
RUN poetry install --no-dev


# 'development' stage installs all dev deps and can be used to develop code.
# For example using docker-compose to mount local volume under /app
FROM python-base as development

# Copying poetry and venv into image
COPY --from=builder-base $POETRY_HOME $POETRY_HOME
COPY --from=builder-base $PYSETUP_PATH $PYSETUP_PATH

# Copying in our entrypoint
# COPY ./docker/docker-entrypoint.sh /docker-entrypoint.sh
RUN chmod +x . /opt/pysetup/.venv/bin/activate

# venv already has runtime deps installed we get a quicker install
WORKDIR $PYSETUP_PATH
RUN poetry install
WORKDIR /app
COPY . .
EXPOSE 5000 5672
CMD ["python", "src/manage.py", "runserver", "0.0.0.0:5000"]


# 'unit-tests' stage runs our unit tests with unittest and coverage.
FROM development AS unit-tests
RUN coverage run --omit='src/manage.py,src/config/*,*/.venv/*,*/*__init__.py,*/tests.py,*/admin.py' src/manage.py test src --tag=ut && \
    coverage report

# ./pipelines/pr.yaml
# This is triggered by the PR and branch policies
trigger: none

# Read in the base variable template
variables:
  imageRepository: companyapp
  dockerfilePath: $(Build.SourcesDirectory)
  vmImageName: ubuntu-latest

# Use the ubuntu-latest image
pool:
  vmImage: $(vmImageName)

stages:
- stage: Build
  displayName: Build image for tests...
  jobs:
  - job: BuildingAndCache
    displayName: Building and caching image for tests...
    steps:

    - task: Cache@2
      displayName: Creating cache...
      inputs:
        key: 'docker | "$(Agent.OS)" | cache'
        path: $(Pipeline.Workspace)/docker
        cacheHitVar: CACHE_RESTORED

    - task: Docker@2
      displayName: Building image for tests...
      inputs:
        command: 'build'
        repository: $(imageRepository)-$(service)
        dockerfile: $(dockerFilePath)/$(service)/docker/Dockerfile
        buildContext: $(dockerFilePath)/$(service)
        arguments: |
          --target development
        tags: |
          pr-$(service)
      env:
        DOCKER_BUILDKIT: 1

    - bash: |
        mkdir -p $(Pipeline.Workspace)/docker
        docker save -o $(Pipeline.Workspace)/docker/cache.tar $(imageRepository)-$(service):pr-$(service)
      displayName: Saving image to cache...
      # condition: and(not(canceled()), not(failed()), ne(variables.CACHE_RESTORED, 'true'))

- stage: UnitTest
  displayName: Run unit tests...
  jobs:
  - job: UnitTesting
    displayName: Running unit tests...
    steps:

    - task: Cache@2
      displayName: Checking cache for existing images...
      inputs:
        key: 'docker | "$(Agent.OS)" | cache'
        path: $(Pipeline.Workspace)/docker
        cacheHitVar: CACHE_RESTORED

    - script: |
        docker load -i $(Pipeline.Workspace)/docker/cache.tar
        docker images
      displayName: Loading existing image from cache...
      condition: and(not(canceled()), eq(variables.CACHE_RESTORED, 'true'))

    - task: Docker@2
      displayName: Running unit-tests...
      inputs:
        command: 'build'
        repository: $(imageRepository)-$(service)
        dockerfile: $(dockerFilePath)/$(service)/docker/Dockerfile
        buildContext: $(dockerFilePath)/$(service)
        arguments: |
          --cache-from=$(imageRepository)-$(service):pr-$(service)
          --target unit-tests
        tags: |
          pr-$(service)
      env:
        DOCKER_BUILDKIT: 1

**Any suggestions for what I'm doing wrong and how to resolve it?**


---
The resources I've been consulting:
- https://docs.docker.com/build/building/multi-stage/
- https://github.com/michaeloliverx/python-poetry-docker-example/blob/master/docker/Dockerfile
- https://learn.microsoft.com/en-us/azure/devops/pipelines/release/caching?view=azure-devops#docker-images
- https://stackoverflow.com/questions/59266670/how-to-enable-docker-layer-caching-in-azure-devops
- I also tried ChatGPT, but it only suggests what I'm already doing.

</details>


# Answer 1
**Score**: 0

Figured it out, and it was more of an oversight: the image the `Dockerfile` was looking for didn't match the name of the image built in `pr.yaml`.

I needed to update:

...
FROM development AS unit-tests
...

to match the name of the image being built here:

...
- task: Docker@2
  displayName: Building image for tests...
  inputs:
    command: 'build'
    repository: $(imageRepository)-$(service)
    dockerfile: $(dockerFilePath)/$(service)/docker/Dockerfile
    buildContext: $(dockerFilePath)/$(service)
    arguments: |
      --target development
    tags: |
      pr-$(service)
  env:
    DOCKER_BUILDKIT: 1

- bash: |
    mkdir -p $(Pipeline.Workspace)/docker
    docker save -o $(Pipeline.Workspace)/docker/cache.tar $(imageRepository)-$(service):pr-$(service)
  displayName: Saving image to cache...

...

The name of the image being built was:

companyapp-api:pr-api

So the `Dockerfile` should have been:

...
FROM companyapp-api:pr-api AS unit-tests
...

Once I did that, the `unit-tests` stage took 10s instead of 1m20s.
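
To see where that name comes from: the build step tags the image as `$(imageRepository)-$(service):pr-$(service)`. A small shell sketch of that expansion follows, assuming `service` is set to `api`. That variable is not defined in the YAML shown, so the value here is inferred from the resulting tag.

```shell
# Mirror of the pipeline's tag template; the values are assumptions for illustration.
image_repository="companyapp"   # imageRepository in pr.yaml
service="api"                   # assumed; not defined in the YAML shown

# Same pattern as $(imageRepository)-$(service):pr-$(service)
image_ref="${image_repository}-${service}:pr-${service}"

# The FROM line that the unit-tests stage would need for this image
echo "FROM ${image_ref} AS unit-tests"
```

Because the name is hard-coded in the `Dockerfile`, the `FROM` line has to be kept in sync with the `repository` and `tags` inputs of the build step.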

<details>
<summary>English:</summary>

Figured it out, and it was more of an oversight: the image the `Dockerfile` was looking for didn't match the name of the image built in `pr.yaml`.

I needed to update:

...
FROM development AS unit-tests
...

to match the name of the image being built here:

...
- task: Docker@2
  displayName: Building image for tests...
  inputs:
    command: 'build'
    repository: $(imageRepository)-$(service)
    dockerfile: $(dockerFilePath)/$(service)/docker/Dockerfile
    buildContext: $(dockerFilePath)/$(service)
    arguments: |
      --target development
    tags: |
      pr-$(service)
  env:
    DOCKER_BUILDKIT: 1

- bash: |
    mkdir -p $(Pipeline.Workspace)/docker
    docker save -o $(Pipeline.Workspace)/docker/cache.tar $(imageRepository)-$(service):pr-$(service)
  displayName: Saving image to cache...

...

The name of the image being built was:

companyapp-api:pr-api

So the `Dockerfile` should have been:

...
FROM companyapp-api:pr-api AS unit-tests
...

Once I did that, the `unit-tests` stage took 10s instead of 1m20s.

</details>



huangapple
  • Published on 2023-06-09 07:35:27
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76436331.html