AWS Step Function error handling for Go Lambda

Question

I cannot find a detailed explanation of how to define the error matcher in a Step Functions state machine, based on the error returned by a Go Lambda handler.

The handler is a bog-standard Go function that returns an error if it gets a 503 from an upstream service:

func HandleHotelBookingRequest(ctx context.Context, booking HotelBookingRequest) (
	confirmation HotelBookingResponse, err error) {

	...
	if statusCode == http.StatusServiceUnavailable {
		// Upstream is temporarily unavailable: signal a retryable failure.
		err = errors.New("TransientError")
	} else {
		...

I can control what the function returns and how it formats the string, but I cannot find any real information about what to use here (or in a Catch clause, for that matter) so that it matches the above:

      "Retry": [
        {
          "ErrorEquals": [
            "TransientError"
          ],
          "BackoffRate": 1,
          "IntervalSeconds": 1,
          "MaxAttempts": 3,
          "Comment": "Retry for Transient Errors (503)"
        }
      ]

When I test the Lambda in the Console, this is what I get (as expected) when the upstream returns a 503:

{
  "errorMessage": "TransientError",
  "errorType": "errorString"
}

And I have the distinct impression (but not quite sure how to validate this) that if I change to:

          "ErrorEquals": [
            "errorString"
          ],

the Retry works (at least, looking at the CloudWatch logs, I can see the transient errors being logged, but the Step function eventually succeeds).

I cannot find much documentation on this, but:

  1. Is it possible to match on the actual error message? (I saw that API Gateway allows that, using a regex.)
  2. If that's not possible, should I return a different "error type" instead of a plain error?

Thanks in advance!

Answer 1

Score: 5

Finally solved the riddle. In the end it was trivial and nearly identical to the JavaScript approach (which gave me the hint and is widely documented in examples). However, as I was unable to find a Go-specific answer anywhere (in the AWS documentation, which is otherwise expansive, good and detailed; on Google; or here), I am posting it for future reference.

TL;DR: define your own implementation of the error interface and return a value of that type instead of the bog-standard errors.New() or fmt.Errorf(), then use the type name in the ErrorEquals clause.

A very basic example implementation is shown in this gist.

To test this, I have created an ErrorStateMachine (JSON definition in the same gist) and selected a different catcher based on the ErrorEquals type:

        {
          "ErrorEquals": [
            "HandlerError"
          ],
          "Next": "Handler Error"
        }

Testing the Step Function with different Outcome inputs causes different paths to be chosen.

What I guess tripped me up was that I am a relative beginner when it comes to Go, and I hadn't realized that errorString is the actual concrete type behind the error returned by errors.New(), which fmt.Errorf() also uses internally (at least when no %w verb is involved):

// in errors/errors.go

// errorString is a trivial implementation of error.
type errorString struct {
	s string
}

I had naively assumed that this was just something that AWS named.

An interesting twist (which is not really ideal) is that the actual error message is "wrapped" in the Step function output and may be a bit cumbersome to parse in subsequent steps:

{
  "Error": "HandlerError",
  "Cause": "{\"errorMessage\":\"error from a failed handler\",\"errorType\":\"HandlerError\"}"
}

It would certainly have been a lot more developer-friendly if the actual error message (generated by Error()) were emitted straight into the Cause field.
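If a later step is itself a Go Lambda, one way to get at the message is to decode twice: once for the outer object and once for the JSON string held in Cause. The sketch below assumes the catcher's output is passed to that step unchanged; the CaughtError and LambdaCause types and the UnwrapCause function are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CaughtError mirrors the object shown above, as received by the next step.
type CaughtError struct {
	Error string `json:"Error"`
	Cause string `json:"Cause"`
}

// LambdaCause mirrors the JSON document embedded as a string in Cause.
type LambdaCause struct {
	ErrorMessage string `json:"errorMessage"`
	ErrorType    string `json:"errorType"`
}

// UnwrapCause decodes the outer object, then the JSON string nested in Cause.
func UnwrapCause(raw []byte) (LambdaCause, error) {
	var outer CaughtError
	if err := json.Unmarshal(raw, &outer); err != nil {
		return LambdaCause{}, fmt.Errorf("decoding catcher output: %w", err)
	}
	var cause LambdaCause
	if err := json.Unmarshal([]byte(outer.Cause), &cause); err != nil {
		return LambdaCause{}, fmt.Errorf("decoding Cause: %w", err)
	}
	return cause, nil
}

func main() {
	raw := []byte(`{"Error":"HandlerError","Cause":"{\"errorMessage\":\"error from a failed handler\",\"errorType\":\"HandlerError\"}"}`)

	cause, err := UnwrapCause(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(cause.ErrorType + ": " + cause.ErrorMessage)
}
```

It may also be possible to do the unwrapping inside the state machine itself with the States.StringToJson intrinsic function, which would avoid the extra decoding in code.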

Hope others find this useful and won't have to waste time on this like I did.

huangapple
  • Posted on July 28, 2021, 10:45:52
  • When reposting, please keep the link to this article: https://go.coder-hub.com/68553764.html