Execute script that uses my local package - ImportErrors

Question

My project had been up and running in a Kubernetes container for a while, until I decided to "clean up" the sys.path calls I had at the top of my modules. This included describing my dependencies in pyproject.toml and ditching setup.py altogether; it had imported setuptools and called setup() when run as __main__.

The design intent is not to run anything in /app/tnc as a script; it is rather a collection of modules, or a package. The only part of the codebase that serves as __main__ is the api.py file. It initializes and fires up Flask.

Implementation

I have a lean deployment setup that consists of the following:

  1. the core library in /opt/venv
  2. my package /app/tnc
  3. and the entry point /app/bin/api

I kick-off the flask app with: python /app/bin/api.

The build takes place in the python:3.11-slim Docker image. Here I install the recommended gcc and specify the following in the Dockerfile:

-- build
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY pyproject.toml project.toml
RUN pip3 install -e .  # aside: better would be to use python -m pip install -e .

I then copy the following from the build into my runtime image.

-- runtime
ENV PATH "/opt/venv/bin:$PATH"
ENV PYTHONPATH "/opt/venv/bin:/app/tnc"
COPY --chown=appuser:appuser bin bin
COPY --chown=appuser:appuser tnc tnc
COPY --chown=appuser:appuser config.py config.py
COPY --from=builder /opt/venv/ /opt/venv

As I mentioned, in the kubernetes deployment I fire-up the container with:

command: ["python3"]
args: ["bin/api"]

My observations working to find the solution

Firing up the container in such a way that I can run the python REPL:

  • import flask generates AttributeError ...replace(' -> None', '')
  • remove /app/tnc from the PYTHONPATH, import flask generates ModuleNotFoundError ... no tnc

AttributeError ...replace(' -> None', '')

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/venv/lib/python3.10/site-packages/werkzeug/__init__.py", line 2, in <module>
    from .test import Client as Client
  File "/opt/venv/lib/python3.10/site-packages/werkzeug/test.py", line 35, in <module>
    from .sansio.multipart import Data
  File "/opt/venv/lib/python3.10/site-packages/werkzeug/sansio/multipart.py", line 19, in <module>
    class Preamble(Event):
  File "/usr/local/lib/python3.10/dataclasses.py", line 1175, in wrap
    return _process_class(cls, init, repr, eq, order, unsafe_hash,
  File "/usr/local/lib/python3.10/dataclasses.py", line 1093, in _process_class
    str(inspect.signature(cls)).replace(' -> None', ''))
AttributeError: module 'inspect' has no attribute 'signature'

ModuleNotFoundError: No module named 'tnc'

appuser@tnc-py-deployment-set-1:/app$ echo $PYTHONPATH
/opt/venv/bin
appuser@tnc-py-deployment-set-1:/app$ echo $PATH
/opt/venv/bin:/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
appuser@tnc-py-deployment-set-1:/app$ python -m /app/bin/api
/opt/venv/bin/python: No module named /app/bin/api
appuser@tnc-py-deployment-set-1:/app$ python /app/bin/api
Traceback (most recent call last):
  File "/app/bin/api", line 12, in <module>
    from tnc.s3 import S3Session
ModuleNotFoundError: No module named 'tnc'

The project structure

├── bin
│   └── api
├── config.py
├── pyproject.toml
└── tnc
    ├── __init__.py
    ├── data
    │   ├── __init__.py
    │   ├── download.py
    │   ├── field_types.py
    │   └── storage_providers
    ├── errors.py
    ├── inspect
    │   ├── __init__.py
    │   └── etl_time_index.py
    ├── test
    │   ├── __init__.py
    │   └── test_end-to-end.py
    ├── utils.py
    └── www
        ├── __init__.py
        └── routes
            ├── __init__.py
            ├── feedback.py
            ├── livez.py
            └── utils.py

pyproject.toml

[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"

[tool.setuptools.packages.find]
where = ["./"]
exclude = [ "res", "notes" ]

dependencies = [ ... with version specs ]

Answer 1

Score: 0

First, I have to give a shout-out to the pyproject.toml + setuptools team: the documentation and implementation have gotten good. They allowed me to get a lot more specific and "deterministic" :)) about my setup, not to mention a bit more aggressive in the build process.

Fixing the "not found" errors

The fix included the following:

  1. Updated pyproject.toml with the following:
[tool.setuptools.package-dir]
tnc = "tnc"
bin = "bin"

# entry point (not required but is ergonomic)
[project.scripts]
run-api = "bin.api:main"

I included an __init__.py to mark each subpackage.

  2. Perhaps not required, but I moved the config.py file into the bin directory. This location captured my design intent. Changes to the api.py file...
# instantiate the config object using a string ref to the config.py
app.config.from_object("bin.config.DevelopmentConfig")
...

# added a def main() to enable the option of specifying an entry point
def main():
    """Start the app; also the target of the run-api entry point script."""
    logging.basicConfig(level=logging.DEBUG)
    app.run(host=app.config['HOST'], port=app.config['PORT'])

# running api.py directly as a script goes through the same function
if __name__ == '__main__':
    main()
  3. In the Dockerfile I set the PYTHONPATH env value to "/app", the location of the tnc and bin directories. This is by no means a best practice, but given my determination to keep bin separate from tnc, it was the only approach that made sense for this use case.
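
One reason "/app" rather than "/app/tnc" belongs on PYTHONPATH: with /app/tnc on the path, the tnc/inspect subpackage becomes importable as a top-level module named inspect and shadows the standard library module, which is consistent with the AttributeError in the question's traceback. A minimal sketch that reproduces the effect in a throwaway directory (the helper name and the temporary layout are made up for illustration):

```python
import os
import subprocess
import sys
import tempfile


def stdlib_inspect_is_shadowed(pythonpath: str, cwd: str) -> bool:
    """Start a fresh interpreter with PYTHONPATH set and report whether
    `import inspect` yields a module that lacks `signature`."""
    env = dict(os.environ, PYTHONPATH=pythonpath)
    probe = "import inspect; print(hasattr(inspect, 'signature'))"
    result = subprocess.run(
        [sys.executable, "-c", probe],
        cwd=cwd, env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip() == "False"


with tempfile.TemporaryDirectory() as app_dir:
    # Hypothetical stand-in for /app: app_dir/tnc/inspect/__init__.py
    tnc_dir = os.path.join(app_dir, "tnc")
    inspect_pkg = os.path.join(tnc_dir, "inspect")
    os.makedirs(inspect_pkg)
    for package_dir in (tnc_dir, inspect_pkg):
        open(os.path.join(package_dir, "__init__.py"), "w").close()

    # PYTHONPATH=<app>/tnc exposes tnc/inspect as top-level "inspect"
    shadowed_with_tnc_on_path = stdlib_inspect_is_shadowed(tnc_dir, app_dir)
    # PYTHONPATH=<app> exposes it only as "tnc.inspect"
    shadowed_with_app_on_path = stdlib_inspect_is_shadowed(app_dir, app_dir)

print(shadowed_with_tnc_on_path, shadowed_with_app_on_path)
```

With the package root's parent on the path, the subpackage is only reachable as tnc.inspect, so the standard library module stays intact.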

Improved build process

Finally, while there are a few well-known techniques to maximize reuse of the cache when building the Docker image, I wanted to call out how easy it was to know precisely what was going on during the build, made possible by the latest setuptools configured with pyproject.toml.

A. It was trivial to first run the build using an empty stub for where the app code would eventually go.

# pyproject.dependencies.toml
packages = ["tnc"]

... paired with the 2 phased build (the image is an official docker python image)

# Make sure to use the venv from the python base img:
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

# phase 1: dependency build using an empty project dir
COPY pyproject.dependencies.toml pyproject.toml
RUN mkdir tnc
RUN pip3 install .

# phase 2: full and final build
COPY bin bin
COPY tnc tnc
COPY pyproject.toml pyproject.toml
RUN pip3 install .
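
Why the two phases help can be sketched with a toy model of layer caching. This is an illustration only, not Docker's actual algorithm (which also considers file metadata and more): each layer's key is assumed to depend on its parent layer, the instruction text, and the content of any copied files. The function names and placeholder file contents are hypothetical, and the RUN mkdir tnc step is left out for brevity.

```python
import hashlib


def layer_key(parent_key: str, instruction: str, copied_files: dict) -> str:
    """Toy cache key: parent layer + instruction text + copied file content."""
    digest = hashlib.sha256()
    digest.update(parent_key.encode())
    digest.update(instruction.encode())
    for name in sorted(copied_files):
        digest.update(name.encode())
        digest.update(copied_files[name])
    return digest.hexdigest()


def build_keys(deps_toml: bytes, full_toml: bytes, app_code: bytes):
    """Return (phase 1 key, phase 2 key) for a two-phase build."""
    # phase 1: only the dependency-only pyproject is an input
    key = layer_key("base", "COPY pyproject.dependencies.toml pyproject.toml",
                    {"pyproject.toml": deps_toml})
    phase1 = layer_key(key, "RUN pip3 install .", {})
    # phase 2: app code and the full pyproject are inputs
    key = layer_key(phase1, "COPY bin tnc pyproject.toml",
                    {"tnc": app_code, "pyproject.toml": full_toml})
    phase2 = layer_key(key, "RUN pip3 install .", {})
    return phase1, phase2


deps = b"placeholder dependency-only pyproject"
full = b"placeholder full pyproject"
p1_a, p2_a = build_keys(deps, full, b"app code v1")
p1_b, p2_b = build_keys(deps, full, b"app code v2")
p1_c, _ = build_keys(deps + b" plus one more dependency", full, b"app code v2")

# Editing app code keeps the phase 1 key; editing dependencies changes it.
print(p1_a == p1_b, p2_a == p2_b, p1_a == p1_c)
```

In this model an edit to the application code leaves the dependency layer's key untouched, so the expensive install step can be reused.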

B. It was clear what to copy from the now-consolidated build artifacts into the image used for distribution.

COPY --from=builder --chown=appuser:appuser /app/build/lib/tnc tnc
COPY --from=builder --chown=appuser:appuser /app/build/lib/bin bin
COPY --from=builder --chown=root:root /opt/venv/ /opt/venv

In the kube deployment, despite being able to call the entry point configured using pyproject.toml, I chose to call api.py as a script.

# in the kube deployment for the image
command: ["python"]
args: ["/app/bin/api.py"]
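
For comparison, the run-api script that pip generates from [project.scripts] essentially imports the module named before the colon and calls the attribute named after it. A rough sketch of that lookup, using a stand-in module called demo_api (hypothetical; the real target here would be bin.api:main):

```python
import importlib
import sys
import types


def resolve_entry_point(spec: str):
    """Resolve a "package.module:attribute" string to the object it names."""
    module_name, _, attr_path = spec.partition(":")
    target = importlib.import_module(module_name)
    for attr in attr_path.split("."):
        target = getattr(target, attr)
    return target


# Stand-in for bin/api.py so the sketch runs without Flask installed.
demo_api = types.ModuleType("demo_api")


def _demo_main():
    return "started"


demo_api.main = _demo_main
sys.modules["demo_api"] = demo_api

main = resolve_entry_point("demo_api:main")
result = main()
print(result)
```

Either way the same main() runs; the entry point just requires that the module be importable, while the script form only needs the file path.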

Conclusion

I have an improved design that no longer includes "ad-hoc" calls to sys.path, nor resorts to "polluting" the PYTHONPATH. The single entry I now have, /app, conveys an important design choice: wanting to have the entry point be in a separate root directory.
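
The reason a path entry is needed at all: when Python runs a script by file path, the first sys.path entry is the directory containing the script (here /app/bin), not the working directory, so the sibling tnc package is not found unless /app is on the path or the package is installed into the venv. A small sketch that checks this with a throwaway layout (the temporary paths are illustrative):

```python
import os
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    app_dir = os.path.realpath(tmp)          # stand-in for /app
    bin_dir = os.path.join(app_dir, "bin")   # stand-in for /app/bin
    os.makedirs(bin_dir)
    script = os.path.join(bin_dir, "api.py")
    with open(script, "w") as fh:
        fh.write("import sys\nprint(sys.path[0])\n")

    # Drop variables that would alter the default sys.path behaviour.
    env = {k: v for k, v in os.environ.items()
           if k not in ("PYTHONPATH", "PYTHONSAFEPATH")}
    result = subprocess.run(
        [sys.executable, script],
        cwd=app_dir, env=env, capture_output=True, text=True, check=True,
    )
    first_entry = os.path.realpath(result.stdout.strip())

# The script's own directory comes first, even though cwd was app_dir.
print(first_entry == bin_dir, first_entry == app_dir)
```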

huangapple
  • Posted on July 31, 2023 at 22:38:40
  • Original link: https://go.coder-hub.com/76804674.html