fasthttp websocket: panic nil: websocket.(*Conn).beginMessage()

huangapple go评论96阅读模式
英文:

fasthttp websocket: panic nil: websocket.(*Conn).beginMessage()

问题

使用 Fiber v2.38.1,它使用 fasthttp/websocket v1.5.0

应用程序崩溃并显示以下错误:

2022-10-15T04:05:42.983563+00:00 app[web.1]: time="2022-10-15T04:05:42Z" level=info msg="close msg received: &{418 close}"
2022-10-15T04:05:42.983564+00:00 app[web.1]: conn is nil: false, msg: &{418 close <nil> conn 0}
2022-10-15T04:05:42.986035+00:00 app[web.1]: panic: runtime error: invalid memory address or nil pointer dereference
2022-10-15T04:05:42.986035+00:00 app[web.1]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x78 pc=0x8db967]
2022-10-15T04:05:42.986036+00:00 app[web.1]: 
2022-10-15T04:05:42.986037+00:00 app[web.1]: goroutine 86 [running]:
2022-10-15T04:05:42.986037+00:00 app[web.1]: github.com/fasthttp/websocket.(*Conn).beginMessage(0x0, 0xc000348f60, 0x1)
2022-10-15T04:05:42.986037+00:00 app[web.1]: 	/tmp/codon/tmp/cache/go-path/pkg/mod/github.com/fasthttp/websocket@v1.5.0/conn.go:479 +0x27
2022-10-15T04:05:42.986038+00:00 app[web.1]: github.com/fasthttp/websocket.(*Conn).NextWriter(0x0, 0x1)
2022-10-15T04:05:42.986038+00:00 app[web.1]: 	/tmp/codon/tmp/cache/go-path/pkg/mod/github.com/fasthttp/websocket@v1.5.0/conn.go:520 +0x45
2022-10-15T04:05:42.986039+00:00 app[web.1]: github.com/fasthttp/websocket.(*Conn).WriteJSON(0xe79a80?, {0xc143a0, 0xc00041e600})
2022-10-15T04:05:42.986039+00:00 app[web.1]: 	/tmp/codon/tmp/cache/go-path/pkg/mod/github.com/fasthttp/websocket@v1.5.0/json.go:24 +0x45
2022-10-15T04:05:42.986039+00:00 app[web.1]: github.com/kuchaguangjie/go-fit/model.(*WsClient).WriteMsg(0xc00056a280, 0xc00041e600)
2022-10-15T04:05:42.986040+00:00 app[web.1]: 	/tmp/build_0b2fe533/model/ws_model.go:97 +0xf9
2022-10-15T04:05:42.986040+00:00 app[web.1]: github.com/kuchaguangjie/go-fit/controller.(*ContentController).Ws.func1.2()
2022-10-15T04:05:42.986041+00:00 app[web.1]: 	/tmp/build_0b2fe533/controller/content_ctl.go:623 +0x171
2022-10-15T04:05:42.986041+00:00 app[web.1]: created by github.com/kuchaguangjie/go-fit/controller.(*ContentController).Ws.func1
2022-10-15T04:05:42.986042+00:00 app[web.1]: 	/tmp/build_0b2fe533/controller/content_ctl.go:608 +0x10cc
2022-10-15T04:05:43.113930+00:00 heroku[web.1]: Process exited with status 2

它在 conn.go:479 处崩溃,具体是在 beginMessage() 函数中:
> if c.writer != nil {

我已经检查过 c 不是 nil(如日志所示),所以它为什么会崩溃...


顺便说一下,它部署在 Heroku 上,我在 Heroku 上看到以下日志:
> 2022-10-15T04:59:30.344791+00:00 heroku[router]: at=error code=H15 desc="Idle connection"

不确定是否相关。


更新:相关代码

  • 处理关闭消息的单独 goroutine(content_ctl.go 的一部分):
    // 处理关闭消息 - 来自通道,
    // TODO: 使用 goroutine 池?
    go func() {
        if r := recover(); r != nil { // TODO: 这样做合适吗?
            fmt.Printf("recover from panic: %v\n", r)
        }

        closeMsg := <-(*client).CloseChan // 处理关闭消息
        log.Infof("close msg received: %v", closeMsg)

        // 发送关闭消息
        closeResp := &model.WsResp{
            Status: closeMsg.HttpCode,
            Source: model.RespSourceConn,
            Msg:    closeMsg.Msg,
        }

        log.Infof("is conn nil: %v, msg: %v\n", client.WsConn == nil, closeResp)
        err = client.WriteMsg(closeResp)
        if err != nil {
            log.Errorf("error - on write close msg: %v", closeResp)
        }

        ctl.cwm.CloseAndCleanup(client.Id) // 关闭并清理
    }()
  • WsClient 结构体的定义,以及它的 WriteMsg() 方法(ws_model.go 的一部分):
    // WebSocket 客户端
    type WsClient struct {
        WsUser
        ContentUuid string          `json:"contentUuid"` // 内容 UUID
        WsConn      *websocket.Conn `json:"-"`           // WebSocket 连接
        CloseChan   chan *CloseMsg  `json:"-"`           // 关闭通道
        WriteLock   *sync.Mutex     // 写锁,避免并发写入 WebSocket 连接,写入或关闭时获取锁
    }

    // 带锁的写入 WebSocket 消息
    func (wsc *WsClient) WriteMsg(msg *WsResp) error {
        if wsc.WsConn == nil {
            return errutil.ErrNoConn
        }
        wsc.WriteLock.Lock()
        err := wsc.WsConn.WriteJSON(msg)
        wsc.WriteLock.Unlock()
        return err
    }
  • CloseMsg 结构体(ws_model.go 的一部分):
    // 发送到关闭通道的关闭消息
    type CloseMsg struct {
        HttpCode int    `json:"httpCode"`
        Msg      string `json:"msg"`
    }

问题状态:

  • 实际上,在提问之前,我通过检查 CloseMsg 中的 httpCode 是否为 418 来解决了问题,如果是 418,则不会发送关闭消息。
  • 尽管问题已经解决,但我仍然想知道问题的实际原因,所以我提出了这个问题。

一些背景信息:

  • 对于 CloseMsgHttpCode,我有几个可能的值。
    • 418 是默认值,只有在连接发生意外的 I/O 问题时才会发生(例如,Heroku 因空闲而终止连接)。
    • 对于其他情况,例如如果我想踢出一个连接,我会发送其他的 HTTP 状态码。
  • 从日志中,我发现问题只发生在关闭消息的 HTTP 状态码为 418 时。
  • 顺便说一下,虽然 418 可能不是最好的 HTTP 状态码(它是一个笑话),但我认为这不会影响问题。
  • WsClientWriteMsg() 方法中,我一直在发送消息之前检查连接是否为 nil。

更新 - 来自 Heroku 的更多日志

2022-10-16T05:19:03.088622+00:00 heroku[router]: at=error code=H15 desc="Idle connection" method=GET path="/api/v1/content/ws/RgPx8znal7AJb
2022-10-16T05:19:03.065493+00:00 app[web.1]: time="2022-10-16T05:19:03Z" level=info msg="error - on read:  websocket: close 1006 (abnormal closure): unexpected EOF"
2022-10-16T05:20:57.758379+00:00 app[web.1]: time="2022-10-16T05:20:57Z" level=info msg="cleanup client, id = 'wscid-ee670cc5-4100-49d6-9857-8284d93a6d33'"
2022-10-16T05:20:57.758505+00:00 app[web.1]: time="2022-10-16T05:20:57Z" level=info msg="close msg received: &{418 close}"

看起来 Heroku 会在一段时间内没有消息时关闭 WebSocket 连接。这就是 418 出现的原因。

另外,我在 SO 上提了一个关于 Heroku 终止 WebSocket 连接的问题:https://stackoverflow.com/questions/74084808/heroku-kill-idle-connection-websocket

英文:

Using Fiber v2.38.1, which in turn use fasthttp/websocket v1.5.0.

App crashed with following error:

2022-10-15T04:05:42.983563+00:00 app[web.1]: time=&quot;2022-10-15T04:05:42Z&quot; level=info msg=&quot;close msg received: &amp;{418 close}&quot;
2022-10-15T04:05:42.983564+00:00 app[web.1]: conn is nil: false, msg: &amp;{418 close &lt;nil&gt; conn 0}
2022-10-15T04:05:42.986035+00:00 app[web.1]: panic: runtime error: invalid memory address or nil pointer dereference
2022-10-15T04:05:42.986035+00:00 app[web.1]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x78 pc=0x8db967]
2022-10-15T04:05:42.986036+00:00 app[web.1]: 
2022-10-15T04:05:42.986037+00:00 app[web.1]: goroutine 86 [running]:
2022-10-15T04:05:42.986037+00:00 app[web.1]: github.com/fasthttp/websocket.(*Conn).beginMessage(0x0, 0xc000348f60, 0x1)
2022-10-15T04:05:42.986037+00:00 app[web.1]: 	/tmp/codon/tmp/cache/go-path/pkg/mod/github.com/fasthttp/websocket@v1.5.0/conn.go:479 +0x27
2022-10-15T04:05:42.986038+00:00 app[web.1]: github.com/fasthttp/websocket.(*Conn).NextWriter(0x0, 0x1)
2022-10-15T04:05:42.986038+00:00 app[web.1]: 	/tmp/codon/tmp/cache/go-path/pkg/mod/github.com/fasthttp/websocket@v1.5.0/conn.go:520 +0x45
2022-10-15T04:05:42.986039+00:00 app[web.1]: github.com/fasthttp/websocket.(*Conn).WriteJSON(0xe79a80?, {0xc143a0, 0xc00041e600})
2022-10-15T04:05:42.986039+00:00 app[web.1]: 	/tmp/codon/tmp/cache/go-path/pkg/mod/github.com/fasthttp/websocket@v1.5.0/json.go:24 +0x45
2022-10-15T04:05:42.986039+00:00 app[web.1]: github.com/kuchaguangjie/go-fit/model.(*WsClient).WriteMsg(0xc00056a280, 0xc00041e600)
2022-10-15T04:05:42.986040+00:00 app[web.1]: 	/tmp/build_0b2fe533/model/ws_model.go:97 +0xf9
2022-10-15T04:05:42.986040+00:00 app[web.1]: github.com/kuchaguangjie/go-fit/controller.(*ContentController).Ws.func1.2()
2022-10-15T04:05:42.986041+00:00 app[web.1]: 	/tmp/build_0b2fe533/controller/content_ctl.go:623 +0x171
2022-10-15T04:05:42.986041+00:00 app[web.1]: created by github.com/kuchaguangjie/go-fit/controller.(*ContentController).Ws.func1
2022-10-15T04:05:42.986042+00:00 app[web.1]: 	/tmp/build_0b2fe533/controller/content_ctl.go:608 +0x10cc
2022-10-15T04:05:43.113930+00:00 heroku[web.1]: Process exited with status 2

It crashed at conn.go:479, in function beginMessage():
> if c.writer != nil {

I've checked that c is not nil (as shown in log), so how it crashed ...


BTW, it's deployed on heroku, I saw following log on heroku:
> 2022-10-15T04:59:30.344791+00:00 heroku[router]: at=error code=H15 desc="Idle connection"

Not sure is it relevent.


Update: relevant code

  • A separate goroutine that handle close msg (part of content_ctl.go):
    // handle close - from a channel,
    // TODO: use goroutine pool ?
    go func() {
        if r := recover(); r != nil { // TODO: is this proper ?,
            fmt.Printf(&quot;recover from panic: %v\n&quot;, r)
        }

        closeMsg := &lt;-(*client).CloseChan // handle close,
        log.Infof(&quot;close msg received: %v&quot;, closeMsg)

        // send close msg,
        closeResp := &amp;model.WsResp{
            Status: closeMsg.HttpCode,
            Source: model.RespSourceConn,
            Msg:    closeMsg.Msg,
        }

        log.Infof(&quot;is conn nil: %v, msg: %v\n&quot;, client.WsConn == nil, closeResp)
        err = client.WriteMsg(closeResp)
        if err != nil {
            log.Errorf(&quot;error - on write close msg: %v&quot;, closeResp)
        }

        ctl.cwm.CloseAndCleanup(client.Id) // close &amp; clean up,
    }()
  • Definition of WsClient struct, and its method WriteMsg() (part of ws_model.go):
    // websocket client,
    type WsClient struct {
        WsUser
        ContentUuid string          `json:&quot;contentUuid&quot;` // content uuid id,
        WsConn      *websocket.Conn `json:&quot;-&quot;`           // websocket connection,
        CloseChan   chan *CloseMsg  `json:&quot;-&quot;`           // close channel,
        WriteLock   *sync.Mutex     // write lock, to avoid concurrent write to ws connection, which will cause panic, acquire lock when write or close,
    }

    // write ws msg, with lock,
    func (wsc *WsClient) WriteMsg(msg *WsResp) error {
        if wsc.WsConn == nil {
            return errutil.ErrNoConn
        }
        wsc.WriteLock.Lock()
        err := wsc.WsConn.WriteJSON(msg)
        wsc.WriteLock.Unlock()
        return err
    }

  • CloseMsg struct (part of ws_model.go):
    // close msg sent to close chan,
    type CloseMsg struct {
        HttpCode int    `json:&quot;httpCode&quot;`
        Msg      string `json:&quot;msg&quot;`
    }

Problem status:

  • Actually, before asking the question I've solved the problem by checking the httpCode in the CloseMsg, if it's 418, then I won't send the close msg.
  • Though the issue is gone for now, but still I want to know the actual reason of the problem, so I asked the question.

Some background:

  • I've several possible value for CloseMsg's HttpCode.
    • 418 is the default one, it happens only when unexpected io issue happen on the connection (e.g heroku terminate the connection due to idle ?).
    • For other cases, e.g if I want to kick out a connection, then I will send other http code.
  • From the log, I found the problem happens only when the http code from close msg is 418.
  • BTW, though 418 might not be the best http code to use, (it's from a joke), but that doesn't effect the problem I guess.
  • And, In WsClient's WriteMsg() method, I've always checked whether the conn nil, before send msg.

Update - more log from heroku

2022-10-16T05:19:03.088622+00:00 heroku[router]: at=error code=H15 desc=&quot;Idle connection&quot; method=GET path=&quot;/api/v1/content/ws/RgPx8znal7AJb
2022-10-16T05:19:03.065493+00:00 app[web.1]: time=&quot;2022-10-16T05:19:03Z&quot; level=info msg=&quot;error - on read:  websocket: close 1006 (abnormal closure): unexpected EOF&quot;
2022-10-16T05:20:57.758379+00:00 app[web.1]: time=&quot;2022-10-16T05:20:57Z&quot; level=info msg=&quot;cleanup client, id = &#39;wscid-ee670cc5-4100-49d6-9857-8284d93a6d33&#39;&quot;
2022-10-16T05:20:57.758505+00:00 app[web.1]: time=&quot;2022-10-16T05:20:57Z&quot; level=info msg=&quot;close msg received: &amp;{418 close}&quot;

Seems heroku will kill the websocket connection if there is no msg for a while. That's when 418 appear.

And, I've asked a question about heroku killing the websocket connection on SO: https://stackoverflow.com/questions/74084808/heroku-kill-idle-connection-websocket

答案1

得分: 1

这是一个fasthttp的bug,在v1.5.1-rc.8版本中已修复。

参考链接:https://github.com/fasthttp/websocket/issues/30#issuecomment-1326860975

英文:

This is a fasthttp bug, and fixed in v1.5.1-rc.8

refer: https://github.com/fasthttp/websocket/issues/30#issuecomment-1326860975

huangapple
  • 本文由 发表于 2022年10月15日 12:27:21
  • 转载请务必保留本文链接:https://go.coder-hub.com/74076674.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定