ModuleNotFoundError for 'sklearn' as subdependency of numpy
Question
I am using Docker combined with virtualenv to run a project for a client, but I am getting a ModuleNotFoundError for sklearn.
In my Pipfile I have added the numpy dependency
numpy = "==1.21.6"
The error is thrown from the following line
np.load(PATH_TO_NPY_FILE, allow_pickle=True)
with the following stack trace:
development_1 | File "/root/.local/share/virtualenvs/code-_Py8Si6I/lib/python3.7/site-packages/numpy/lib/npyio.py", line 441, in load
development_1 | pickle_kwargs=pickle_kwargs)
development_1 | File "/root/.local/share/virtualenvs/code-_Py8Si6I/lib/python3.7/site-packages/numpy/lib/format.py", line 748, in read_array
development_1 | array = pickle.load(fp, **pickle_kwargs)
development_1 | ModuleNotFoundError: No module named 'sklearn'
I find this strange because sklearn should be installed as part of the numpy dependency tree, right?
Still, I tried the suggestions I found in other posts, like adding the following command explicitly to my Dockerfile
python -m pip install scikit-learn scipy matplotlib
However, the error still persists.
For completeness, I'll provide some extra info below, although the key question remains why installing numpy does not put its subdependencies in place.
Project structure
The project is sort of a bridge between SQS on one hand and the logic of the client on the other. The code from which the error is thrown comes from a git submodule, and the Pipfile is added on the top-level repo. The submodule does not contain a Pipfile. The submodules folder has an __init__.py file because it contains functions that I want to use in my src code.
In the tree below, my code is in main.py, and the error-throwing code is in submodules/module2/bar.py.
|- src/
| |- main.py
|
|- submodules/
| |- module1
| | |- foo.py
| | |- setup.py
| |
| |- module2
| | |- bar.py
| |
| |- __init__.py
|
|- .gitmodules
|- Pipfile
|- Dockerfile
Dockerfile contents
Note that at this point, it is a bit of an aggregate of solutions I took from the other post on the matter. That's why both pip install scikit-learn and apt-get install python3-sklearn are currently included. Will prune later when I finally have fixed this issue.
FROM python:3.7
WORKDIR code/
COPY Pipfile .
COPY submodules/ submodules/
RUN pip install pipenv && \
pipenv install --deploy && \
python -m pip install scikit-learn scipy matplotlib && \
apt-get update && \
apt-get install -y locales ffmpeg libsm6 libxext6 libxrender-dev python3-sklearn && \
sed -i -e 's/# nl_BE.UTF-8 UTF-8/nl_BE.UTF-8 UTF-8/' /etc/locale.gen && \
dpkg-reconfigure --frontend=noninteractive locales
ENV LANG nl_BE.UTF-8
ENV LC_ALL nl_BE.UTF-8
COPY .env .
COPY src/ .
COPY data/ data
CMD [ "pipenv", "run", "python", "main.py" ]
Pipfile contents
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"
[packages]
python-dotenv = "*"
boto3 = "*"
pySqsListener = "*"
xpress = "==9.0.5"
module1 = {path = "./submodules/module1"}
pandas = "==1.3.4"
numpy = "==1.21.6"
[dev-packages]
[requires]
python_version = "3.7"
Answer 1
Score: 2
> I find this strange, because sklearn should be installed as part of the numpy dependency tree, right?
No, scikit-learn is not a dependency of numpy (it's the other way around).
Since loading that pickle file requires sklearn, just put scikit-learn in the Pipfile (and matplotlib if you need it); it's a dependency of your project.
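To see why the file itself pulls in sklearn, it may help to look at what pickle stores. The following is a minimal sketch using only the standard library. The module name fake_estimators and the class Model are hypothetical stand-ins for sklearn and whatever estimator was saved in the .npy file. A pickle records the module path of each object's class, and unpickling has to import that module again.

```python
import pickle
import sys
import types

# Hypothetical stand-in for a third-party package such as sklearn.
fake = types.ModuleType("fake_estimators")


class Model:
    def __init__(self, weight):
        self.weight = weight


# Make the class look as if it had been defined inside fake_estimators.
Model.__module__ = "fake_estimators"
Model.__qualname__ = "Model"
fake.Model = Model
sys.modules["fake_estimators"] = fake

# The pickle stores a reference to "fake_estimators.Model", not the class code.
payload = pickle.dumps(Model(3))

# Simulate an environment where the package is not installed.
del sys.modules["fake_estimators"]

try:
    pickle.loads(payload)
    error = None
except ModuleNotFoundError as exc:
    error = exc

print(error)
```

np.load(..., allow_pickle=True) ends up in pickle.load, as the stack trace in the question shows, so the same import happens there for any sklearn object stored in the array.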
Installing them outside the pipenv-generated environment with that python -m pip install scikit-learn scipy matplotlib will likely have no effect (nor will the apt-installed python3-sklearn), since virtualenvs are designed to be separate from the ambient environment.
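One way to check which environment a given interpreter is using, and whether sklearn is visible from it, is a small diagnostic script. This is a rough sketch, not part of the original setup. The function name describe_environment and the file name check_env.py are made up for illustration, and the virtualenv detection is a common heuristic rather than a guarantee.

```python
import importlib.util
import sys


def describe_environment(module_name="sklearn"):
    """Report which interpreter is running and whether module_name is importable."""
    spec = importlib.util.find_spec(module_name)
    # Older virtualenv releases set sys.real_prefix; newer ones and venv
    # make sys.prefix differ from sys.base_prefix instead.
    in_virtualenv = (
        getattr(sys, "real_prefix", None) is not None
        or sys.prefix != getattr(sys, "base_prefix", sys.prefix)
    )
    return {
        "interpreter": sys.executable,
        "prefix": sys.prefix,
        "in_virtualenv": in_virtualenv,
        "module_found": spec is not None,
        "module_location": getattr(spec, "origin", None),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Saved as check_env.py (a hypothetical name) and run inside the container once with python check_env.py and once with pipenv run python check_env.py, the two runs would be expected to report different prefixes, and only the environment where scikit-learn was actually installed should report module_found as True.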