Calling a Redshift stored procedure through Step Functions does not wait for it to finish and triggers the next job

Question

When I call a Redshift query through AWS Step Functions, the workflow does not wait for that step to finish and triggers the next job.

I tried enabling the "Wait for callback" option, but then the Redshift step keeps running and the job never completes. I have to kill it manually.

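Answer 1

Score: 0

With the .waitForCallback integration pattern, Step Functions makes the API call described in the task, then waits for you to call back to Step Functions using the SendTaskSuccess or SendTaskFailure API actions, including the task token that you sent along with the API call. In the case of arn:aws:states:::aws-sdk:redshiftdata:executeStatement, I don't think there's a feasible way to make that work, since nothing on the Redshift side would send the token back, so it's not a fit for your use case.

For certain of our Optimized Service Integrations, we support the Run a Job (.sync) integration pattern, but we do not have such an integration for the Redshift Data API.

For scenarios like this, where the API implements an asynchronous interaction, the best solution is a polling loop like the one shown below. It runs the query and tries to load the result; if the result is not ready yet, it waits 5 seconds and tries again.

You can build this loop into each of your workflows. Or, if you expect to reuse it in many workflows, you can create a dedicated state machine that encapsulates the action, then call it from any of your state machines using the Optimized Service Integration for Step Functions, which does support .sync.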
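For the reuse option just described, a Task state in a parent workflow might look roughly like the fragment below. This is only a sketch: the state names, the account ID, the region, the child state machine name RunRedshiftStatement, and the stored procedure name are all hypothetical placeholders, and it assumes the child state machine contains the polling loop and reads the SQL text from its input.

```json
{
  "Run Redshift statement": {
    "Type": "Task",
    "Comment": "Hypothetical parent-workflow state; the ARN and input values are placeholders",
    "Resource": "arn:aws:states:::states:startExecution.sync:2",
    "Parameters": {
      "StateMachineArn": "arn:aws:states:us-east-1:123456789012:stateMachine:RunRedshiftStatement",
      "Input": {
        "Sql": "CALL my_schema.my_procedure()",
        "AWS_STEP_FUNCTIONS_STARTED_BY_EXECUTION_ID.$": "$$.Execution.Id"
      }
    },
    "Next": "Next job"
  }
}
```

Because the resource ends in .sync:2, the parent should pause at this state until the child execution finishes, and only then move on to the next job.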

{
  "StartAt": "ExecuteStatement",
  "States": {
    "ExecuteStatement": {
      "Type": "Task",
      "Parameters": {
        "ClusterIdentifier": "example-cluster",
        "Database": "sample_data_dev",
        "Sql": "SELECT sum(qtysold) FROM tickit.sales, tickit.date WHERE sales.dateid = date.dateid AND caldate = '2008-01-05'"
      },
      "Resource": "arn:aws:states:::aws-sdk:redshiftdata:executeStatement",
      "Next": "GetStatementResult"
    },
    "GetStatementResult": {
      "Type": "Task",
      "Parameters": {
        "Id.$": "$.Id"
      },
      "Resource": "arn:aws:states:::aws-sdk:redshiftdata:getStatementResult",
      "Next": "Has the query finished?",
      "Catch": [
        {
          "ErrorEquals": [
            "RedshiftData.ResourceNotFoundException"
          ],
          "Next": "Wait to let the query complete",
          "ResultPath": "$.getStatementResultError"
        }
      ]
    },
    "Has the query finished?": {
      "Type": "Choice",
      "Choices": [
        {
          "And": [
            {
              "Variable": "$.ColumnMetadata",
              "IsPresent": true
            }
          ],
          "Next": "Success"
        }
      ],
      "Default": "Wait to let the query complete"
    },
    "Success": {
      "Type": "Succeed"
    },
    "Wait to let the query complete": {
      "Type": "Wait",
      "Seconds": 5,
      "Next": "GetStatementResult"
    }
  }
}
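Since the question is about a stored procedure, which may not return a result set at all, a variant that polls the statement status instead of the result may be a better fit. As far as I can tell, GetStatementResult keeps raising ResourceNotFoundException for statements that produce no result set, so the loop above might never exit in that case. The sketch below is an assumption-laden alternative, not the original answer's method: it uses the Redshift Data API DescribeStatement action and branches on its Status field. The procedure name and the Fail state's error name are hypothetical, and the cluster and database values are carried over from the example above.

```json
{
  "Comment": "Sketch only: polls DescribeStatement; my_schema.my_procedure is a hypothetical name",
  "StartAt": "ExecuteStatement",
  "States": {
    "ExecuteStatement": {
      "Type": "Task",
      "Parameters": {
        "ClusterIdentifier": "example-cluster",
        "Database": "sample_data_dev",
        "Sql": "CALL my_schema.my_procedure()"
      },
      "Resource": "arn:aws:states:::aws-sdk:redshiftdata:executeStatement",
      "Next": "DescribeStatement"
    },
    "DescribeStatement": {
      "Type": "Task",
      "Parameters": {
        "Id.$": "$.Id"
      },
      "Resource": "arn:aws:states:::aws-sdk:redshiftdata:describeStatement",
      "Next": "Has the statement finished?"
    },
    "Has the statement finished?": {
      "Type": "Choice",
      "Choices": [
        {
          "Variable": "$.Status",
          "StringEquals": "FINISHED",
          "Next": "Success"
        },
        {
          "Or": [
            { "Variable": "$.Status", "StringEquals": "FAILED" },
            { "Variable": "$.Status", "StringEquals": "ABORTED" }
          ],
          "Next": "Statement failed"
        }
      ],
      "Default": "Wait to let the statement complete"
    },
    "Wait to let the statement complete": {
      "Type": "Wait",
      "Seconds": 5,
      "Next": "DescribeStatement"
    },
    "Statement failed": {
      "Type": "Fail",
      "Error": "RedshiftStatementFailed",
      "Cause": "The statement ended with status FAILED or ABORTED"
    },
    "Success": {
      "Type": "Succeed"
    }
  }
}
```

The DescribeStatement response includes the statement Id, so the Wait state can pass it straight back into the next poll. Treating FAILED and ABORTED as terminal keeps the workflow from looping forever when the procedure errors out.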

huangapple
  • Posted on May 25, 2023, 05:48:04
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76327612.html