Packages not installed during Docker build

Question

I'm trying to install tesseract-ocr in a Docker container based on the python:3.10 image. During the build the installation appears to succeed, but afterwards I cannot find the files inside the container. If I then open a shell in the container and install the package from there, it works.

The relevant parts of my Dockerfile look like this:

# debian based
FROM python:3.10
WORKDIR /code
RUN mkdir __logger

RUN apt-get update -y
RUN apt-get install apt-utils -y

# tesseract part, tried both apt & apt-get
RUN apt-get install tesseract-ocr -y

COPY ./requirements.txt ./
RUN pip install --upgrade pip
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "./app.py"]

Then I run the container with docker compose up, open a shell in it with docker exec -t -i my_container_name /bin/bash, and finally run find / -type d -name "*tesseract*", which yields no results.

If I run apt-cache search tesseract-ocr, I can see that the package is available.

If I then run apt install tesseract-ocr inside the container terminal, I can see the files being installed. If I run the find command again (here with the pattern "*tess*"), tesseract now shows up:

root@06d4e841c6d2:/code# find / -type d -name "*tess*"
/usr/share/doc/tesseract-ocr-eng
/usr/share/doc/tesseract-ocr-osd
/usr/share/doc/tesseract-ocr
/usr/share/doc/libtesseract4
/usr/share/tesseract-ocr
/usr/share/tesseract-ocr/4.00/tessdata
/usr/share/tesseract-ocr/4.00/tessdata/tessconfigs

How can I make it work so that it is installed correctly during the build phase?

Here's a snippet of the log from towards the end of the build, for the step RUN apt-get install tesseract-ocr -y:

#18 4.079 Preparing to unpack .../5-tesseract-ocr-osd_1%3a4.00~git30-7274cfa-1.1_all.deb ...
#18 4.086 Unpacking tesseract-ocr-osd (1:4.00~git30-7274cfa-1.1) ...
#18 4.447 Selecting previously unselected package tesseract-ocr.
#18 4.451 Preparing to unpack .../6-tesseract-ocr_4.1.1-2.1_amd64.deb ...
#18 4.463 Unpacking tesseract-ocr (4.1.1-2.1) ...
#18 4.552 Setting up libarchive13:amd64 (3.4.3-2+deb11u1) ...
#18 4.574 Setting up tesseract-ocr-eng (1:4.00~git30-7274cfa-1.1) ...
#18 4.596 Setting up libgif7:amd64 (5.1.9-2) ...
#18 4.618 Setting up tesseract-ocr-osd (1:4.00~git30-7274cfa-1.1) ...
#18 4.640 Setting up liblept5:amd64 (1.79.0-1.1+deb11u1) ...
#18 4.665 Setting up libtesseract4:amd64 (4.1.1-2.1) ...
#18 4.688 Setting up tesseract-ocr (4.1.1-2.1) ...
#18 4.710 Processing triggers for libc-bin (2.31-13+deb11u6) ...
#18 DONE 4.8s 

Answer 1

Score: 0

I'm unable to reproduce your problem. I created a Docker image with this truncated Dockerfile:

# debian based
FROM python:3.10
WORKDIR /code
RUN mkdir __logger

RUN apt-get update -y
RUN apt-get install apt-utils -y

# tesseract part, tried both apt & apt-get
RUN apt-get install tesseract-ocr -y

I then built the Docker image with docker build --tag stackoverflow:test .

After that I started a container from the image, opened a shell in it, and was able to find tesseract:

% docker run -it stackoverflow:test /bin/bash
root@2e2e3599c939:/code# find / -type d -name "*tess*"
/usr/share/doc/tesseract-ocr
/usr/share/doc/libtesseract4
/usr/share/doc/tesseract-ocr-osd
/usr/share/doc/tesseract-ocr-eng
/usr/share/tesseract-ocr
/usr/share/tesseract-ocr/4.00/tessdata
/usr/share/tesseract-ocr/4.00/tessdata/tessconfigs
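
Note that find / -type d only matches directories, so it never reports the tesseract executable itself, which the Debian package installs as a regular file. As a complementary check, here is a minimal sketch that asks dpkg and the shell directly. The helper names check_installed and check_binary and the PKG and BIN variables are illustrative and not part of the original post, and the sketch assumes a Debian-based image where dpkg-query is available.

```shell
#!/bin/sh
# Sketch: report whether a Debian package and its executable are present.
# Intended to be run inside the container. PKG and BIN are illustrative defaults.
PKG="${PKG:-tesseract-ocr}"
BIN="${BIN:-tesseract}"

check_installed() {
    # dpkg-query exits non-zero when the package is unknown to dpkg;
    # a fully installed package reports the status "install ok installed".
    if status=$(dpkg-query -W -f='${Status}' "$1" 2>/dev/null) \
        && [ "$status" = "install ok installed" ]; then
        echo "$1: installed"
    else
        echo "$1: not installed"
    fi
}

check_binary() {
    # command -v prints the resolved path when the executable is on PATH.
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: not on PATH"
    fi
}

check_installed "$PKG"
check_binary "$BIN"
```

Running this in a container started from the freshly built image, and again in the container started by Compose, should show whether the two are really based on the same image.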

So this problem is a bit of a stumper. But here are a few things you can try that might help:

  1. Try to build the Docker image by itself, without using Docker Compose.
  2. When building, pass the --no-cache argument to docker build to rule out stale cached layers.
  3. Make sure that you are running the newest version of Docker.
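
The three suggestions above can be sketched as a small script. This is only an illustration. The run helper and the DRY_RUN variable are hypothetical conveniences, so the commands are printed by default and executed only with DRY_RUN=0. The image tag is the one used in this answer. The Compose lines are an extra assumption about the asker's setup, included because docker compose up reuses an already-built image unless a rebuild is requested.

```shell
#!/bin/sh
# Sketch: print (and optionally execute) the troubleshooting commands.
# DRY_RUN=1 (default) only prints; DRY_RUN=0 actually runs them.
DRY_RUN="${DRY_RUN:-1}"
IMAGE_TAG="stackoverflow:test"

run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# Suggestion 3: check which Docker version is in use.
run docker --version

# Suggestions 1 and 2: build with plain docker, bypassing Compose
# and ignoring any cached layers.
run docker build --no-cache --tag "$IMAGE_TAG" .

# Verify the result in a throwaway container.
run docker run --rm "$IMAGE_TAG" tesseract --version

# Assumption beyond the answer: if Compose is used afterwards,
# rebuild its image and recreate the container from it.
run docker compose build --no-cache
run docker compose up --force-recreate
```

If the plain docker build produces an image that contains tesseract but the Compose container does not, the Compose container was most likely created from an older image.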

huangapple
  • Posted on June 5, 2023, 18:17:12
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76405437.html