How to define a SageMaker Estimator with entry_point and source_dir when you have your own Python package (with setup.py at the root)


Question


My code structure is like this:

|-my_directory
|----- README.md
|----- setup.py
|----- src
|---------- my_train_script.py
|---------- __init__.py
|----- requirements.txt

I want to define a SageMaker estimator for a training step. If I pass "my_directory" as source_dir and "src/my_train_script.py" as entry_point, I get an error saying: No module named src/my_train_script.
The code works fine if we move my_train_script.py to the root and set entry_point=my_train_script.py, or if we remove setup.py from my_directory.
This is not an optimal solution: I want to keep setup.py for other purposes. Is there a right way to define the estimator?

Example of the estimator (TensorFlow):

from sagemaker import get_execution_role
from sagemaker.tensorflow import TensorFlow

estimator = TensorFlow(
    entry_point="src/my_train_script.py",  # nested path: this is the setup that fails
    source_dir="my_directory",
    role=get_execution_role(),
    instance_count=1,
    instance_type="ml.m5.2xlarge",
    framework_version="2.10.1",
    py_version="py39",
    debugger_hook_config=None,
    disable_profiler=True,
    base_job_name="base_job_name",
)


Answer 1

Score: 1


As the official documentation says:

> entry_point (str or PipelineVariable) –
>
> The absolute or relative path to the local Python source file that
> should be executed as the entry point to training. (Default: None). If
> source_dir is specified, then entry_point must point to a file located
> at the root of source_dir. If ‘git_config’ is provided, ‘entry_point’
> should be a relative location to the Python source file in the Git
> repo.
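
The rule quoted above can be checked locally before a job is submitted. The following is only a minimal sketch using the standard library, not part of the SageMaker SDK; the helper name entry_point_at_root is hypothetical, and it assumes entry_point is given relative to source_dir.

```python
from pathlib import Path


def entry_point_at_root(source_dir: str, entry_point: str) -> bool:
    """Return True if entry_point names a file directly inside source_dir."""
    relative = Path(entry_point)
    # A nested path such as "src/my_train_script.py" has more than one part,
    # so it cannot sit at the root of source_dir.
    if relative.is_absolute() or len(relative.parts) != 1:
        return False
    # The file must actually exist at the top level of source_dir.
    return (Path(source_dir) / relative).is_file()
```

With the layout from the question, entry_point_at_root("my_directory", "src/my_train_script.py") would return False, which matches the failing configuration.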

What is the problem with keeping the setup.py file at root level together with the training script?


You can try rethinking the structure of your folder in this way:

|-my_directory
|----- README.md
|----- my_train_script.py
|----- utils
|---------- setup.py
|---------- __init__.py
|----- requirements.txt

Then, inside my_train_script.py, you can import setup with:

from utils import setup
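
Since the question reports that a root-level entry point works even with setup.py present, another possible workaround is a thin launcher at the root of my_directory that delegates to the script inside src, leaving the original layout untouched. This is an assumption, not something the answer proposes, and it has not been verified on SageMaker. In the sketch below, the launcher file name train_launcher.py and the helper run_training_module are hypothetical.

```python
# Hypothetical file: my_directory/train_launcher.py
import runpy


def run_training_module(module_name: str = "src.my_train_script") -> dict:
    """Run a module as if it had been started with `python -m module_name`."""
    # run_name="__main__" makes the target's `if __name__ == "__main__":`
    # block execute; the module's globals are returned when it finishes.
    return runpy.run_module(module_name, run_name="__main__")


# In the real launcher, finish with a call to: run_training_module()
```

Under that assumption, the estimator would be defined with entry_point="train_launcher.py" and source_dir="my_directory", so the entry point sits at the root of source_dir as the documentation requires. Command-line arguments stay in sys.argv, so a script that parses its hyperparameters from sys.argv[1:] should still see them.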

Posted by huangapple on 2023-02-10 06:33:11
Link to this article: https://go.coder-hub.com/75405135.html