How to save or log a PyTorch model using MLflow?

Question

I am in main.py at the root directory, calling the model script to train the model. The directory looks like this:

[Image of the directory structure not preserved]

After training the model, I am planning to save and log the PyTorch model using MLflow. Here’s the code

# Registering the model to the workspace
mlflow.pytorch.log_model(
    pytorch_model=model,
    registered_model_name="use-case1-model",
    artifact_path="use-case1-model",
    input_example=df[['Title', 'Attributes']],
    conda_env=os.path.join("./dependencies", "conda.yaml"),
    code_paths=["./models"],
)

# Saving the model to a file
mlflow.pytorch.save_model(
    pytorch_model=model,
    conda_env=os.path.join("./dependencies", "conda.yaml"),
    input_example=df[['Title', 'Attributes']],
    path=os.path.join(args.model, "use-case1-model"),
    code_paths=["./models"],
)

But I am getting an error while saving the code paths, saying the directory is not found.

Question 1: Do I need to set the code_paths and extra_files parameters in my case?

Question 2: What is the right way to specify the code_paths directory?

https://mlflow.org/docs/latest/python_api/mlflow.pytorch.html

Answer 1

Score: 0

As per the function definition, the code_paths parameter takes a list of local filesystem paths to Python file dependencies (or directories containing file dependencies).

If your model has such dependencies, you need to pass their paths to code_paths as a list.

The "directory not found" error can be resolved by using absolute paths, as below.

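import os

# Build absolute paths starting from the current working directory
code_pth = os.path.join(os.path.abspath(""), "media", "model")
conda_env = os.path.join(os.path.abspath(""), "dependencies")
print(conda_env)
print(code_pth)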
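
Note that os.path.abspath("") resolves against the current working directory, so the result depends on where the script is launched from. A minimal sketch of an alternative, assuming you want to anchor the paths on a known base directory and fail early when one is missing; the helper name resolve_code_paths is hypothetical and not part of MLflow:

```python
from pathlib import Path


def resolve_code_paths(base_dir, relative_dirs):
    """Return absolute paths for each entry, failing early if one is missing.

    Hypothetical helper for illustration; it is not an MLflow function.
    """
    base = Path(base_dir).resolve()
    resolved = []
    for rel in relative_dirs:
        candidate = (base / rel).resolve()
        if not candidate.exists():
            raise FileNotFoundError(f"code path not found: {candidate}")
        resolved.append(str(candidate))
    return resolved


# Hypothetical usage from main.py: anchor on the script's own location
# instead of the current working directory.
# code_paths = resolve_code_paths(Path(__file__).parent, ["models"])
```

The returned list can then be passed directly as code_paths, and a wrong directory name surfaces as a clear FileNotFoundError before MLflow is called.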


I used an sklearn model to demonstrate logging and saving.

mlflow.sklearn.log_model(
    sk_model=clf,
    registered_model_name=registered_model_name,
    artifact_path=registered_model_name,
    code_paths=[code_pth],
    conda_env=os.path.join(conda_env, "conda.yaml")
)

mlflow.sklearn.save_model(
    sk_model=clf,
    path=os.path.join(registered_model_name, "trained_model"),
    code_paths=[code_pth],
    conda_env=os.path.join(conda_env, "conda.yaml")
)
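
The same idea should carry over to the PyTorch flavor from the question. The following is a rough sketch rather than a verified fix: it assumes the "models" and "dependencies" directories from the question sit under the project root, and the function names build_model_kwargs and save_and_log_pytorch_model are hypothetical. Parameter names follow the calls shown in the question and may differ between MLflow versions.

```python
import os


def build_model_kwargs(project_root, code_dir="models", env_dir="dependencies"):
    """Build absolute code_paths and conda_env arguments (hypothetical helper)."""
    root = os.path.abspath(project_root)
    return {
        # code_paths must be a list, even for a single directory
        "code_paths": [os.path.join(root, code_dir)],
        "conda_env": os.path.join(root, env_dir, "conda.yaml"),
    }


def save_and_log_pytorch_model(model, project_root, output_dir,
                               model_name="use-case1-model"):
    """Log and save a PyTorch model using absolute paths (sketch only)."""
    # Imported lazily so this module loads even where MLflow is not installed.
    import mlflow.pytorch

    kwargs = build_model_kwargs(project_root)
    mlflow.pytorch.log_model(
        pytorch_model=model,
        artifact_path=model_name,
        registered_model_name=model_name,
        **kwargs,
    )
    # save_model expects a target directory that does not exist yet
    mlflow.pytorch.save_model(
        pytorch_model=model,
        path=os.path.join(output_dir, model_name),
        **kwargs,
    )
```

From main.py this could be called as save_and_log_pytorch_model(model, os.path.dirname(os.path.abspath(__file__)), args.model), assuming args.model is the output directory as in the question.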

Posted by huangapple on May 24, 2023 at 17:54:01
Source: https://go.coder-hub.com/76322238.html