
Using code_path in mlflow.pyfunc models on Databricks

Question


We are using Databricks over AWS infra, registering models on mlflow.
We write our in-project imports as from src.(module location) import (objects).

Following examples online, I expected that when I use mlflow.pyfunc.log_model(..., code_path=['PROJECT_ROOT/src'], ...), that would add the entire code tree to the model's running environment and thus allow us to keep our imports as-are.
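
For reference, a minimal sketch of the call described above (the wrapper class here is illustrative, not our real code; PROJECT_ROOT stands for the repo root):

import mlflow

# Hypothetical pyfunc wrapper standing in for our real model class
class ModelWrapper(mlflow.pyfunc.PythonModel):
    def predict(self, context, model_input):
        return model_input

mlflow.pyfunc.log_model(
    artifact_path="model",
    python_model=ModelWrapper(),
    code_path=["PROJECT_ROOT/src"],  # the whole source tree, as in the examples online
)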

When logging the model, I get a long list of [Errno 95] Operation not supported, one for each notebook in our repo. This blocks us from registering the model to mlflow.

We have used several ad-hoc solutions and workarounds, from forcing ourselves to work with all code in one file, to only working with files in the same directory (code_path=['./filename.py']), to adding specific libraries (and changing import paths accordingly), etc.

However, none of these is optimal. As a result, we either duplicate code (killing DRY), or we put some imports inside the wrapper (i.e. imports that cannot run in our working environment, since it differs from the environment the model will see when deployed), etc.

We have not yet tried putting all the notebooks (which we believe cause the [Errno 95] Operation not supported errors) into a separate folder. That would be highly disruptive to our current setup and processes, and we'd like to avoid it as much as we can.

Please advise.

Answer 1

Score: 1


I had a similar struggle with Databricks when using custom model logic from an src directory (similar structure to cookiecutter-data-science). The solution was to log the entire src directory using a relative path.

So if you have the following project structure:

.
├── notebooks
│   └── train.py
└── src
    ├── __init__.py
    └── model.py

Your train.py should look like this. Note that AddN comes from the MLflow docs.

import mlflow

from src.model import AddN

model = AddN(n=5)

mlflow.pyfunc.log_model(
    registered_model_name="add_n_model",
    artifact_path="add_n_model",
    python_model=model,
    code_path=["../src"],
)
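
For completeness, src/model.py holds the model class; a minimal sketch of the AddN example from the MLflow docs:

import mlflow

class AddN(mlflow.pyfunc.PythonModel):
    def __init__(self, n):
        self.n = n

    def predict(self, context, model_input):
        # Add `n` to every column of the input DataFrame
        return model_input.apply(lambda column: column + self.n)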

This will copy all the code in src/ and log it with the MLflow artifact, allowing the model to load all of its dependencies.
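
Once logged, the model can be loaded back and used; a minimal sketch, assuming the first registered version of add_n_model:

import mlflow
import pandas as pd

# Load version 1 of the registered model (adjust the version as needed)
model = mlflow.pyfunc.load_model("models:/add_n_model/1")

# AddN(n=5) adds 5 to each input value
model_input = pd.DataFrame([range(10)])
print(model.predict(model_input))  # values 5 through 14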

If you are not using a notebooks/ directory, you will set code_path=["src"]. If you are using sub-directories like notebooks/train/train.py, you will set code_path=["../../src"].
