How to save or log a PyTorch model using MLflow?
Question
I am in main.py at the root directory, calling the model script to train the model. The directory looks like this:
After training the model, I am planning to save and log the PyTorch model using MLflow. Here's the code:
# Registering the model to the workspace
mlflow.pytorch.log_model(
    pytorch_model=model,
    registered_model_name="use-case1-model",
    artifact_path="use-case1-model",
    input_example=df[['Title', 'Attributes']],
    conda_env=os.path.join("./dependencies", "conda.yaml"),
    code_paths=["./models"]
)
# Saving the model to a file
mlflow.pytorch.save_model(
    pytorch_model=model,
    conda_env=os.path.join("./dependencies", "conda.yaml"),
    input_example=df[['Title', 'Attributes']],
    path=os.path.join(args.model, "use-case1-model"),
    code_paths="./models"
)
But I get an error while saving the code paths, saying the directory is not found.
Question 1: Do I need to pass the code paths and extra files parameters in my case?
Question 2: What is the right way to specify the code paths directory?
https://mlflow.org/docs/latest/python_api/mlflow.pytorch.html
Answer 1
Score: 0
As per the function definition, the code_paths parameter takes a list of local filesystem paths to Python file dependencies (or directories containing file dependencies).
If your model has such dependencies, you need to provide their paths as a list to code_paths.
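To illustrate why the list form matters, here is a minimal sketch in plain Python. The helper name as_code_paths is hypothetical and not part of MLflow; it only shows that a bare string such as "./models" is itself iterable, so any code that loops over the argument would see single characters rather than one path.

```python
import os
from pathlib import Path


def as_code_paths(value):
    """Return `value` as a list of path strings.

    Hypothetical helper (not an MLflow function): wraps a single
    str / path-like object in a list, and converts each element of
    an iterable of paths to a plain string.
    """
    if isinstance(value, (str, os.PathLike)):
        return [os.fspath(value)]
    return [os.fspath(p) for p in value]


# A bare string iterates character by character.
print(list("./models"))                     # ['.', '/', 'm', 'o', 'd', 'e', 'l', 's']

# Wrapped in a list, it is one path entry.
print(as_code_paths("./models"))            # ['./models']
print(as_code_paths([Path("models"), "utils.py"]))  # ['models', 'utils.py']
```

The returned list is the shape that code_paths is documented to accept, so it could be passed as code_paths=as_code_paths("./models").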
The directory-not-found error can be resolved by using absolute paths, as below.
import os

# Build absolute paths starting from the current working directory
code_pth = os.path.join(os.path.abspath(""), "media", "model")
conda_env = os.path.join(os.path.abspath(""), "dependencies")
print(conda_env)
print(code_pth)
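Note that os.path.abspath("") resolves against the current working directory, so it still depends on where the script is launched from. A possible variation, sketched below under the assumption that the code directory sits next to main.py, is to anchor the paths to a known base directory and check that they exist before handing them to MLflow. The function name resolve_code_paths is hypothetical and the sketch uses only the standard library.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List


def resolve_code_paths(base_dir: Path, relative_paths: Iterable[str]) -> List[str]:
    """Resolve each relative path against base_dir and verify it exists.

    Hypothetical helper: raises FileNotFoundError early, with the full
    resolved path in the message, instead of failing later during logging.
    """
    resolved = []
    for rel in relative_paths:
        path = (Path(base_dir) / rel).resolve()
        if not path.exists():
            raise FileNotFoundError(f"code path does not exist: {path}")
        resolved.append(str(path))
    return resolved


# Small demonstration in a temporary directory standing in for the project root.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "models").mkdir()
    print(resolve_code_paths(Path(tmp), ["models"]))
```

In main.py this might be used as follows (assuming a models directory next to the script):
base = Path(__file__).resolve().parent
code_paths = resolve_code_paths(base, ["models"])
The resulting list can then be passed as code_paths=code_paths to mlflow.pytorch.log_model or mlflow.pytorch.save_model.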
I used an sklearn model to demonstrate logging and saving.
# Log the model and register it
mlflow.sklearn.log_model(
    sk_model=clf,
    registered_model_name=registered_model_name,
    artifact_path=registered_model_name,
    code_paths=[code_pth],
    conda_env=os.path.join(conda_env, "conda.yaml")
)

# Save the model to a local directory
mlflow.sklearn.save_model(
    sk_model=clf,
    path=os.path.join(registered_model_name, "trained_model"),
    code_paths=[code_pth],
    conda_env=os.path.join(conda_env, "conda.yaml")
)