pdf2image fails in docker container
Question
I have a Python project running in a Docker container, but I can't get `convert_from_path` (from the `pdf2image` library) to work. It works locally on my Windows PC, but not in the Linux-based Docker container.
The error I get each time is `Unable to get page count. Is poppler installed and in PATH?`
The relevant parts of my code look like this:
from pdf2image import convert_from_path
import os
from sys import exit

def my_function(file_source_path):
    try:
        pages = convert_from_path(file_source_path, 600, poppler_path=os.environ.get('POPPLER_PATH'))
    except Exception as e:
        print('Fail 1')
        print(e)
        try:
            pages = convert_from_path(file_source_path, 600)
        except Exception as e:
            print('Fail 2')
            print(e)
            try:
                pages = convert_from_path(file_source_path, 600, poppler_path=r'\usr\local\bin')
            except Exception as e:
                print('Fail 3')
                print(e)
                print(os.environ)
                exit('Exiting script')
In attempt 1 I try to reference the original poppler files saved on Windows. The `POPPLER_PATH` variable points to '/code/poppler', which is a bind mount defined as follows:
[snippet from docker-compose.yml]
- type: bind
  source: "C:/Program Files/poppler-0.68.0/bin"
  target: /code/poppler
In attempt 2 I leave the path out entirely. In attempt 3 I tried a path that I found had worked locally for some other users.
The relevant parts of my Dockerfile look like this:
FROM python:3.10
WORKDIR /code
# install poppler
RUN apt-get update
RUN apt-get install poppler-utils -y
COPY ./requirements.txt ./
RUN pip install --upgrade pip
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "./app.py"]
Answer 1
Score: 0
The issue was that my Docker image was not being rebuilt correctly. After nuking the build cache and trying again, the middle option (attempt 2) worked in combination with the Dockerfile above.
So the combination of `RUN apt-get install poppler-utils -y` in the Dockerfile and not passing a path in the code, i.e. `pages = convert_from_path(file_source_path, 600)`, works. Installing `poppler-utils` with apt puts the poppler binaries in a directory that is already on the `PATH`, so `pdf2image` finds them without `poppler_path`.
The bind mount can also be removed from `docker-compose.yml`, and the corresponding entry from the `.env` file.
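If you want to confirm from inside the container that the rebuilt image really contains poppler, a small check like the one below may help. It is only a sketch: the helper names `locate_poppler_tools` and `convert_pdf` are made up for this example, and it assumes that `pdfinfo` and `pdftoppm` are the two executables `pdf2image` needs for the default conversion path.

```python
import shutil

# Executables from poppler-utils that pdf2image shells out to
# (assumption: default settings, i.e. pdftoppm rather than pdftocairo).
POPPLER_TOOLS = ("pdfinfo", "pdftoppm")


def locate_poppler_tools():
    """Map each poppler tool name to its resolved path, or None if not on PATH."""
    return {tool: shutil.which(tool) for tool in POPPLER_TOOLS}


def convert_pdf(file_source_path, dpi=600):
    """Convert a PDF to images, relying on poppler being on PATH (no poppler_path)."""
    missing = [tool for tool, path in locate_poppler_tools().items() if path is None]
    if missing:
        raise RuntimeError(
            "poppler tools not found on PATH: " + ", ".join(missing)
            + " (was the image rebuilt after adding poppler-utils?)"
        )
    # Imported here so the PATH check above can run even without pdf2image installed.
    from pdf2image import convert_from_path
    return convert_from_path(file_source_path, dpi)


if __name__ == "__main__":
    # Print where each tool was found, or flag it as missing.
    for tool, path in locate_poppler_tools().items():
        print(f"{tool}: {path or 'NOT FOUND'}")
```

Running the script inside the container prints the resolved location of each tool. If either shows `NOT FOUND`, the container is most likely still running a stale image, which matches the build-cache problem described above.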