How to monitor progress of "StartPipelineExecution" for SageMaker on Step Functions?
Question
I currently have a SageMaker pipeline that is executed using Step Functions. I am able to start the execution, but I cannot get the step to wait for the pipeline before moving on to the next step. How should I set this up within Step Functions so that it waits for the pipeline to finish executing before running the next step?
Answer 1
Score: 0
StartPipelineExecution is an asynchronous API that returns immediately with the execution ARN. You have several options for waiting on pipeline completion from a Step Functions state machine. Here are two:
Option 1: Polling
One option is to poll for execution completion within a single State Machine. Polling can be implemented with a Wait - Lambda - Choice task loop, or with a single Lambda function using a callback pattern (in which case the looping is the Lambda's job). In both cases, the Lambda checks the status of the pipeline execution with an SDK call.
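The status check that the Lambda performs could look roughly like the sketch below. It assumes the state input carries the ARN under a `PipelineExecutionArn` key (the key that StartPipelineExecution returns) and uses boto3's `describe_pipeline_execution`. The function names, the return shape, and the injectable `sagemaker_client` parameter are illustrative choices, not something from the original answer.

```python
def check_pipeline_status(event, sagemaker_client=None):
    """Return the current status of one SageMaker pipeline execution.

    `sagemaker_client` is injectable (hypothetical parameter) so the
    function can be exercised without AWS credentials.
    """
    if sagemaker_client is None:
        # Imported lazily so the module loads even where boto3 is absent.
        import boto3
        sagemaker_client = boto3.client("sagemaker")

    execution_arn = event["PipelineExecutionArn"]
    response = sagemaker_client.describe_pipeline_execution(
        PipelineExecutionArn=execution_arn
    )
    # Status is one of: Executing, Stopping, Stopped, Failed, Succeeded.
    return {
        "PipelineExecutionArn": execution_arn,
        "Status": response["PipelineExecutionStatus"],
    }


def lambda_handler(event, context):
    # Lambda entry point; a later Choice state can branch on "Status".
    return check_pipeline_status(event)
```

Returning the ARN alongside the status keeps it available for the next pass through the loop.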
Visually, the State Machine looks like this:
SM #1
[x x x S P x x x x]
where S = SageMaker StartPipelineExecution task, P = poller tasks, x = other tasks.
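As a rough sketch, the poller portion (P) could be written as a Wait - Lambda - Choice loop in Amazon States Language, built here as a Python dict. It assumes a poller Lambda that takes `PipelineExecutionArn` and returns `PipelineExecutionArn` and `Status` keys. The state names, the 60-second default wait, and the terminal Succeed/Fail states are placeholders; in a real state machine the success branch would continue into the remaining tasks.

```python
import json


def build_polling_loop(poller_lambda_arn, wait_seconds=60):
    """Build a Wait -> Lambda -> Choice loop (state names are hypothetical)."""
    return {
        "Comment": "Poll a SageMaker pipeline execution until it finishes",
        "StartAt": "WaitBeforePoll",
        "States": {
            "WaitBeforePoll": {
                "Type": "Wait",
                "Seconds": wait_seconds,
                "Next": "CheckPipelineStatus",
            },
            "CheckPipelineStatus": {
                "Type": "Task",
                "Resource": "arn:aws:states:::lambda:invoke",
                "Parameters": {
                    "FunctionName": poller_lambda_arn,
                    "Payload": {
                        "PipelineExecutionArn.$": "$.PipelineExecutionArn"
                    },
                },
                # Keep only what the loop needs from the Lambda's response.
                "ResultSelector": {
                    "PipelineExecutionArn.$": "$.Payload.PipelineExecutionArn",
                    "Status.$": "$.Payload.Status",
                },
                "Next": "IsPipelineDone",
            },
            "IsPipelineDone": {
                "Type": "Choice",
                "Choices": [
                    {
                        "Variable": "$.Status",
                        "StringEquals": "Succeeded",
                        "Next": "PipelineSucceeded",
                    },
                    {
                        "Or": [
                            {"Variable": "$.Status", "StringEquals": "Failed"},
                            {"Variable": "$.Status", "StringEquals": "Stopped"},
                        ],
                        "Next": "PipelineFailed",
                    },
                ],
                # Executing or Stopping: wait and poll again.
                "Default": "WaitBeforePoll",
            },
            "PipelineSucceeded": {"Type": "Succeed"},
            "PipelineFailed": {"Type": "Fail", "Error": "PipelineDidNotSucceed"},
        },
    }


if __name__ == "__main__":
    # Placeholder ARN, for illustration only.
    arn = "arn:aws:lambda:us-east-1:123456789012:function:pipeline-poller"
    print(json.dumps(build_polling_loop(arn), indent=2))
```

The wait interval is a trade-off: shorter waits react faster but add state transitions, so it is worth sizing it to the typical pipeline run time.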
Option 2: Event-driven
A second option avoids polling. Instead, split your State Machine in two, with the first one ending at the StartPipelineExecution task. Add an EventBridge rule that triggers the second State Machine, which holds the remaining tasks, when a pipeline execution state change event with "currentPipelineExecutionStatus": "Succeeded" is emitted.
SM #1 SM #2
[x x x S] -> pipeline success event -> [x x x x]
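A minimal sketch of such a rule, using boto3's EventBridge `put_rule` and `put_targets` calls, might look like this. The `detail-type` string and the `pipelineArn` detail field reflect the SageMaker event format as I understand it and should be checked against the current AWS documentation. The rule name, target ID, ARNs, and helper names are hypothetical.

```python
import json


def build_success_pattern(pipeline_arn=None):
    """Event pattern matching pipeline executions that reach Succeeded."""
    detail = {"currentPipelineExecutionStatus": ["Succeeded"]}
    if pipeline_arn is not None:
        # Optional filter so only one pipeline triggers the rule (assumed field).
        detail["pipelineArn"] = [pipeline_arn]
    return {
        "source": ["aws.sagemaker"],
        # Assumed detail-type; verify against the SageMaker events docs.
        "detail-type": [
            "SageMaker Model Building Pipeline Execution Status Change"
        ],
        "detail": detail,
    }


def create_success_rule(events_client, rule_name, state_machine_arn,
                        role_arn, pipeline_arn=None):
    """Create the rule and point it at the second state machine (SM #2)."""
    events_client.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(build_success_pattern(pipeline_arn)),
        State="ENABLED",
    )
    events_client.put_targets(
        Rule=rule_name,
        Targets=[
            {
                "Id": "second-state-machine",  # hypothetical target ID
                "Arn": state_machine_arn,
                # Role that allows EventBridge to call states:StartExecution.
                "RoleArn": role_arn,
            }
        ],
    )
```

With a real client this would be called as `create_success_rule(boto3.client("events"), ...)`. By default the matched event is passed to SM #2 as its execution input, so the second state machine can read details such as the execution ARN from it.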
These patterns apply more generally to orchestrating asynchronous tasks. See this related question for another example.