How to speed up GitLab runner?

Question

I run a GitLab Runner in Docker on an AWS t2.medium instance (gp3 disk) with the following .gitlab-ci.yml:

# This file is a template, and might need editing before it works on your project.
# To contribute improvements to CI/CD templates, please follow the Development guide at:
# https://docs.gitlab.com/ee/development/cicd/templates.html
# This specific template is located at:
# https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/ci/templates/Python.gitlab-ci.yml

# Official language image. Look for the different tagged releases at:
# https://hub.docker.com/r/library/python/tags/
image: python:3.10

# Change pip's cache directory to be inside the project directory since we can
# only cache local items.
variables:
    PIP_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pip"

# Pip's cache doesn't store the python packages
# https://pip.pypa.io/en/stable/topics/caching/
#
# If you want to also cache the installed packages, you have to install
# them in a virtualenv and cache it as well.
cache:
    paths:
        - .cache/pip
        - venv/

before_script:
    - python --version ; pip --version # For debugging
    - pip install virtualenv
    - virtualenv venv
    - source venv/bin/activate

stages:
    - build
    - lint

build:
    stage: build
    script:
        - pip install -r requirements-dev.txt

lint:
    stage: lint
    script:
        - flake8 .
        - mypy src

formatting:
    stage: lint
    script:
        - black --check .
        - isort --check .

I have the following problems:

  • It runs very slowly (for example, the build stage takes 8 minutes), and the caching steps are especially slow.
  • For some reason, the cache is created after every job. I don't understand why, since the lint and formatting jobs don't change anything inside the venv directory.
  • After several runs, cache creation fails because the disk fills up (it is only 16 GB). How can I make GitLab Runner clean up the disk?

Answer 1

Score: 1

There are a few things you could try to speed this up.

First, you could build a Docker image that already contains your Python requirements and use it as the image for your jobs (GitLab lets you specify a custom Docker image to run jobs in).
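As a rough sketch of what this could look like in .gitlab-ci.yml, assume an image with the contents of requirements-dev.txt already installed has been built and pushed to the project's container registry. The image name ci-python:3.10 is hypothetical and does not come from the question:

```yaml
# Assumption: this image was built from python:3.10 with
# `pip install -r requirements-dev.txt` already run, then pushed to the
# project's container registry. The name and tag are hypothetical.
image: $CI_REGISTRY_IMAGE/ci-python:3.10

stages:
    - lint

lint:
    stage: lint
    script:
        - flake8 .
        - mypy src

formatting:
    stage: lint
    script:
        - black --check .
        - isort --check .
```

With the tools baked into the image, the build stage, the virtualenv setup in before_script, and the venv/ cache entry should no longer be needed.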

I’m not 100% sure about the caching issue, I’m afraid, but the approach above may help, since the requirements would already be installed. The one caveat is that whenever you change or update the requirements, you’d need to rebuild the image to reflect the change, which may be too much overhead.
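One thing that may be worth trying for the cache uploads, offered as a guess rather than a confirmed diagnosis: GitLab's default cache policy is pull-push, so every job that uses the cache uploads it again when it finishes, whether or not anything changed. A minimal sketch, assuming the lint-stage jobs only need to read the cache and that keying the cache on requirements-dev.txt suits the project:

```yaml
# Sketch only: one job writes the cache, the others just read it.
cache: &pip_cache
    key:
        files:
            - requirements-dev.txt  # new cache only when this file changes
    paths:
        - .cache/pip
        - venv/
    policy: pull-push               # the default, spelled out for clarity

build:
    stage: build
    script:
        - pip install -r requirements-dev.txt

lint:
    stage: lint
    cache:
        <<: *pip_cache              # a job-level cache replaces the global one,
        policy: pull                # so the paths are repeated via the anchor
    script:
        - flake8 .
        - mypy src

formatting:
    stage: lint
    cache:
        <<: *pip_cache
        policy: pull
    script:
        - black --check .
        - isort --check .
```

The image, variables, stages, and before_script entries from the original file are omitted here and would stay as they are.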

As for the disk space issue: just as you have a before_script section, you can add an after_script section. You could use it to remove files that are no longer needed, for example by deleting anything older than a certain date.
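A sketch of such a cleanup step, under the assumption that the files worth deleting live inside the project directory. The paths and the 7-day threshold are illustrative choices, not values from the question:

```yaml
lint:
    stage: lint
    script:
        - flake8 .
        - mypy src
    after_script:
        # Hypothetical cleanup: drop mypy's working directory and any
        # cached pip files not modified in the last 7 days.
        - rm -rf .mypy_cache
        - find .cache/pip -type f -mtime +7 -delete
```

Note that after_script runs inside the job's container, so it can only shrink what ends up in the project directory and its cache. Old images and cache volumes on the runner host itself would have to be cleaned up separately.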

huangapple
  • Posted on May 15, 2023, 05:06:19
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76249677.html