ModuleNotFoundError for 'sklearn' as subdependency of numpy

Question

I am using Docker combined with virtualenv to run a project for a client, but I am getting a ModuleNotFoundError for sklearn.

In my Pipfile I have added the numpy dependency

    numpy = "==1.21.6"

The error is thrown from the following line

    np.load(PATH_TO_NPY_FILE, allow_pickle=True)

with the following stack trace:

    development_1 | File "/root/.local/share/virtualenvs/code-_Py8Si6I/lib/python3.7/site-packages/numpy/lib/npyio.py", line 441, in load
    development_1 | pickle_kwargs=pickle_kwargs)
    development_1 | File "/root/.local/share/virtualenvs/code-_Py8Si6I/lib/python3.7/site-packages/numpy/lib/format.py", line 748, in read_array
    development_1 | array = pickle.load(fp, **pickle_kwargs)
    development_1 | ModuleNotFoundError: No module named 'sklearn'

I find this strange because sklearn should be installed as part of the numpy dependency tree, right?

Still, I tried the suggestions I found in other posts, like adding the following command explicitly to my Dockerfile

    python -m pip install scikit-learn scipy matplotlib

However, the error still persists.

For completeness, I'll provide some extra info below, although the key question remains why installing numpy does not also install its subdependencies.


Project structure

The project is sort of a bridge between SQS on one hand and the logic of the client on the other. The code from which the error is thrown comes from a git submodule, and the Pipfile is added on the top-level repo. The submodule does not contain a Pipfile. The submodules folder has an __init__.py file because it contains functions that I want to use in my src code.
In the tree below, my code is in main.py, and the error-throwing code is in submodules/module2/bar.py.

    |- src/
    | |- main.py
    |
    |- submodules/
    | |- module1
    | | |- foo.py
    | | |- setup.py
    | |
    | |- module2
    | | |- bar.py
    | |
    | |- __init__.py
    |
    |- .gitmodules
    |- Pipfile
    |- Dockerfile

Dockerfile contents

Note that at this point, it is a bit of an aggregate of solutions I took from other posts on the matter. That's why both pip install scikit-learn and apt-get install python3-sklearn are currently included. I will prune it once I have fixed this issue.

    FROM python:3.7
    WORKDIR code/
    COPY Pipfile .
    COPY submodules/ submodules/
    RUN pip install pipenv && \
        pipenv install --deploy && \
        python -m pip install scikit-learn scipy matplotlib && \
        apt-get update && \
        apt-get install -y locales ffmpeg libsm6 libxext6 libxrender-dev python3-sklearn && \
        sed -i -e 's/# nl_BE.UTF-8 UTF-8/nl_BE.UTF-8 UTF-8/' /etc/locale.gen && \
        dpkg-reconfigure --frontend=noninteractive locales
    ENV LANG nl_BE.UTF-8
    ENV LC_ALL nl_BE.UTF-8
    COPY .env .
    COPY src/ .
    COPY data/ data
    CMD [ "pipenv", "run", "python", "main.py" ]

Pipfile contents

    [[source]]
    url = "https://pypi.org/simple"
    verify_ssl = true
    name = "pypi"

    [packages]
    python-dotenv = "*"
    boto3 = "*"
    pySqsListener = "*"
    xpress = "==9.0.5"
    module1 = {path = "./submodules/module1"}
    pandas = "==1.3.4"
    numpy = "==1.21.6"

    [dev-packages]

    [requires]
    python_version = "3.7"

Answer 1

Score: 2

> I find this strange, because sklearn should be installed as part of the numpy dependency tree, right?

No, scikit-learn is not a dependency of numpy (it's the other way around).

Since loading that pickle file requires sklearn, just put scikit-learn in the Pipfile (and matplotlib if you need it); it's a dependency of your project.
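
To see why the load itself needs sklearn, recall that a pickle stores only a reference to the class of each object (module name plus class name), not the class code. The following minimal sketch reproduces the same failure with the standard library alone. The module name fake_sklearn and the class Scaler are hypothetical stand-ins, not anything from the question's data file.

```python
import pickle
import sys
import types

# Build a throwaway module named "fake_sklearn" (hypothetical stand-in for sklearn).
module = types.ModuleType("fake_sklearn")


class Scaler:
    """Stand-in for a library object stored inside the pickled data."""

    def __init__(self, factor):
        self.factor = factor


# Make the class look as if it were defined in fake_sklearn.
Scaler.__module__ = "fake_sklearn"
Scaler.__qualname__ = "Scaler"
module.Scaler = Scaler
sys.modules["fake_sklearn"] = module

# The payload records the reference "fake_sklearn.Scaler", not the class itself.
payload = pickle.dumps(Scaler(2.0))

# Simulate an environment in which the package is not installed.
del sys.modules["fake_sklearn"]

try:
    pickle.loads(payload)
    outcome = "loaded"
except ModuleNotFoundError as exc:
    outcome = f"ModuleNotFoundError: {exc}"

print(outcome)
```

np.load(..., allow_pickle=True) hands object arrays to pickle.load, as the stack trace in the question shows, so any library whose objects were saved in the file has to be importable at load time.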

Installing them outside the pipenv-generated environment with python -m pip install scikit-learn scipy matplotlib will likely have no effect (nor will the apt-installed python3-sklearn), since virtualenvs are designed to be isolated from the ambient environment.
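
A quick way to confirm which environment a given interpreter sees is to print its location and ask whether it can find the module. This is a small diagnostic sketch; the function name describe_environment is made up for this example, and the in_virtualenv check relies on sys.base_prefix, which may not be set by very old virtualenv releases.

```python
import importlib.util
import sys


def describe_environment(module_name="sklearn"):
    """Report which interpreter is running and whether it can import module_name."""
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        # base_prefix differs from prefix when running inside a venv or recent virtualenv.
        "in_virtualenv": sys.prefix != getattr(sys, "base_prefix", sys.prefix),
        "module_found": spec is not None,
        "module_location": getattr(spec, "origin", None),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Saved as, say, check_env.py (a hypothetical filename), it can be run inside the container once with python check_env.py and once with pipenv run python check_env.py; if the two runs report different prefixes, packages installed with the first interpreter are not visible to the second.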

huangapple
  • Published on 13 April 2023 at 16:11:05
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76003126.html