SageMaker - managed containers - script mode

Question

I am new to SageMaker. I am trying out the SKLearn managed container, bringing in my own code in script mode. I train my model as follows:

from sagemaker.sklearn.estimator import SKLearn

sklearn = SKLearn(
    entry_point='skl_iris.py',
    framework_version='1.2-1',
    instance_type='ml.m5.large',
    # instance_type='local',
    role=myrole,
    sagemaker_session=mysess,
    hyperparameters={'learning_rate': 0.005}
)

sklearn.fit(inputs={'train': train_input})

I deploy the trained model:

skl_predictor = sklearn.deploy(
    initial_instance_count=1,
    instance_type='ml.m5.large'
)

and get the predictions by invoking the automatically created endpoint:

import json

import boto3

sgm_runtime = boto3.client('runtime.sagemaker')
inputX = dftest.iloc[0:2, 1:]  # first two rows, every column except the first
body = inputX.to_csv(header=False, index=False)
body = json.dumps(body)

response = sgm_runtime.invoke_endpoint(
    EndpointName=skl_predictor.endpoint_name,  # the predictor returned by deploy()
    ContentType="text/csv",
    Body=body
)

response = response['Body'].read().decode('utf-8')
print(response)

I have one question and 2 problems:

Question:
In script mode with the SKLearn managed container, is it possible to create the model, endpoint config and endpoint myself instead of having that done automatically by the deploy command?

Problem-1:
The invocation is fine when I ask the endpoint to predict for two or more rows of input data. However, when I try to predict only one row of input, I get a CloudWatch error saying that the input shape is wrong. It is driving me crazy. Mind you, and this is probably important, my script only has a model_fn function. I have not yet gotten around to adding input_fn, predict_fn and output_fn functions. Could that be the culprit?

Problem-2:
When I try to train locally, I get the following error message:

'docker-compose' is not installed. Local Mode features will not work without docker-compose. For more information on how to install 'docker-compose', please, see https://docs.docker.com/compose/install/

I never encountered that in any of the tutorials that I have been through. Is that normal?

I thank you in advance.

PS. In case you need the entire sagemaker ipynb as well as the py script just let me know. I will be grateful for the time you put in.

Answer 1

Score: 1

For your first question: yes, you can create the model, endpoint config and endpoint manually, in that order, instead of calling model.deploy(). The SageMaker SDK is a higher-level client that makes it easier to deploy a trained model, but you can use the Boto3 client to perform each step yourself. See this sample notebook on creating a model, config and endpoint: https://github.com/aws/amazon-sagemaker-examples/blob/main/boto3/built-in-frameworks/scikit-learn/boto3_scikit_retrain_model_and_deploy_to_existing_endpoint/boto3_scikit_retrain_model_and_deploy_to_existing_endpoint.ipynb
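As a rough sketch of that order of operations (not taken from the linked notebook), the three Boto3 calls could be wrapped as below. The helper names `build_requests` and `deploy_manually`, the `-config` and `-endpoint` name suffixes, the variant name and the default region are all illustrative assumptions. The image URI, model artifact URL, source archive URL and role ARN are placeholders you would need to supply from your own training job. The environment variables are the ones script-mode containers generally read to locate the entry point, but check them against your own setup.

```python
def build_requests(name, image_uri, model_data_url, source_dir_url, role_arn,
                   entry_point="skl_iris.py", region="us-east-1",
                   instance_type="ml.m5.large"):
    """Build the payloads for create_model, create_endpoint_config, create_endpoint."""
    model = {
        "ModelName": name,
        "ExecutionRoleArn": role_arn,
        "PrimaryContainer": {
            "Image": image_uri,                # SKLearn serving image (placeholder)
            "ModelDataUrl": model_data_url,    # S3 URL of model.tar.gz (placeholder)
            "Environment": {
                # Assumed script-mode settings: which script to run and where it lives.
                "SAGEMAKER_PROGRAM": entry_point,
                "SAGEMAKER_SUBMIT_DIRECTORY": source_dir_url,
                "SAGEMAKER_REGION": region,
            },
        },
    }
    config = {
        "EndpointConfigName": name + "-config",
        "ProductionVariants": [{
            "VariantName": "AllTraffic",
            "ModelName": name,
            "InitialInstanceCount": 1,
            "InstanceType": instance_type,
        }],
    }
    endpoint = {
        "EndpointName": name + "-endpoint",
        "EndpointConfigName": config["EndpointConfigName"],
    }
    return model, config, endpoint


def deploy_manually(sm_client, **kwargs):
    """Create model, endpoint config and endpoint, in that order."""
    model, config, endpoint = build_requests(**kwargs)
    sm_client.create_model(**model)
    sm_client.create_endpoint_config(**config)
    sm_client.create_endpoint(**endpoint)
    return endpoint["EndpointName"]
```

With a real client this would be called as `deploy_manually(boto3.client("sagemaker"), name=..., image_uri=..., model_data_url=..., source_dir_url=..., role_arn=...)`. Keeping the payload construction separate from the calls makes it possible to inspect the requests before anything is created.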

Problem 1: I don't think the missing input_fn/output_fn should cause an issue like this. Are you perhaps sending an index column in your input to the endpoint, so that the payload is parsed differently for multiple rows than for a single row?
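If you want to rule out shape handling on the container side, one option is to add your own input_fn to the entry-point script. The sketch below assumes, without confirmation from the logs, that the error arises because a single CSV row ends up as a one-dimensional array. It parses the payload into a list of rows, so one row still arrives at predict as a 2-D structure. The function name and its two arguments follow the script-mode hook the question mentions; everything inside it is illustrative.

```python
import csv
import io


def input_fn(request_body, request_content_type):
    """Parse a CSV payload into a list of rows so a single row stays 2-D."""
    if request_content_type != "text/csv":
        raise ValueError(f"Unsupported content type: {request_content_type}")
    if isinstance(request_body, (bytes, bytearray)):
        request_body = request_body.decode("utf-8")
    reader = csv.reader(io.StringIO(request_body))
    # Blank lines come back as empty lists and are skipped;
    # every other line becomes one row of floats.
    return [[float(value) for value in row] for row in reader if row]
```

For example, `input_fn("5.1,3.5,1.4,0.2\n", "text/csv")` returns a list containing one four-element row, which scikit-learn estimators accept as a single sample.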

Problem 2: Where are you running your notebook? Local mode uses Docker to spin up a container that mimics the training job, so if Docker (including docker-compose, which the error message names) is not installed, you'll see this error and won't be able to use local mode. SageMaker notebook instances come with Docker pre-installed, so they are a good place to run your notebooks if you need to train in local mode.
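A quick way to see what is missing on the machine running the notebook is to look for the executables on the PATH. This is only a sketch: the helper name `missing_local_mode_tools` is made up for illustration, and it checks only for the standalone `docker` and `docker-compose` binaries. Some Docker installations ship Compose as a `docker compose` plugin instead, which this check does not detect.

```python
import shutil


def missing_local_mode_tools(which=shutil.which):
    """Return the names of the Docker tools that are not found on the PATH."""
    tools = ("docker", "docker-compose")
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_local_mode_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("docker and docker-compose were both found.")
```

The `which` parameter exists only so the lookup can be swapped out when testing; in normal use the default `shutil.which` is enough.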

huangapple
  • Posted on July 27, 2023, 19:43:54
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76779389.html