Go: efficiently handling fragmented data received via TCP

Question

I am writing a small TCP server that is only supposed to read data, parse the received data (dynamic-length frames), and handle those frames. Consider the following code:

func ClientHandler(conn net.Conn) {
    buf := make([]byte, 4096)
    n, err := conn.Read(buf)
    if err != nil {
        fmt.Println("Error reading:", err.Error())
        return
    }
    // Only the first n bytes of buf contain received data.
    fmt.Println("received", n, "bytes of data =", string(buf[:n]))
    handleData(buf[:n])
}

This is essentially my code as it stands: I read some number of bytes into an empty buffer and then handle the data.

The problem occurs when:

  1. A frame of data is bigger than the buffer I am reading into from the TCP connection. In this case I cannot handle the data until I receive the rest, but by then the receiving buffer is already occupied by the earlier data.
  2. The buffer contains one and a half frames and I need to handle the first frame while keeping the incomplete one for later. In this case I need to remove the handled part from the buffer and move the incomplete part to make room for the remainder of the frame.

Both scenarios would probably require reallocating and copying data, which may be costly. Furthermore, I have no idea how to handle frames that are larger than the buffer other than by expanding the buffer, but an ever-growing buffer may lead to performance issues and denial of service. Last but not least, I do not know Go's standard library well enough to know whether any package is built explicitly for handling these situations.

So my questions are:

  1. Is there a best practice for handling these situations?
  2. Is there a Go package that will do some or most of this for me?

Thank you.

Answer 1

Score: 6

Byte slices support fairly well-optimized resizing: append reserves more capacity than it immediately needs, does not copy when the existing capacity suffices, grows the capacity geometrically, and the copy itself is performed by optimized runtime code.

So you can simply use append and see how it performs.
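As a rough illustration of the append approach, here is a minimal sketch. The question does not say how frames are delimited, so the 2-byte big-endian length prefix is purely an assumption, and extractFrames and headerLen are hypothetical names. The two hard-coded chunks stand in for two conn.Read calls that deliver one and a half frames and then the remainder.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// headerLen is the size of the assumed length prefix (2 bytes, big-endian).
// The real frame format is not given in the question.
const headerLen = 2

// extractFrames returns every complete frame found in pending, plus the
// unconsumed tail. The returned slices alias pending, so they must be
// handled before pending is modified again.
func extractFrames(pending []byte) (frames [][]byte, rest []byte) {
	for len(pending) >= headerLen {
		n := int(binary.BigEndian.Uint16(pending))
		if len(pending) < headerLen+n {
			break // incomplete frame: wait for more data
		}
		frames = append(frames, pending[headerLen:headerLen+n])
		pending = pending[headerLen+n:]
	}
	return frames, pending
}

func main() {
	// Simulated reads: one and a half frames, then the missing bytes.
	chunks := [][]byte{
		{0, 3, 'a', 'b', 'c', 0, 4, 'd', 'e'},
		{'f', 'g'},
	}

	var pending []byte
	for _, chunk := range chunks {
		pending = append(pending, chunk...)
		frames, rest := extractFrames(pending)
		for _, f := range frames {
			fmt.Printf("frame: %q\n", f)
		}
		// Move the leftover bytes to the front so the backing array is reused.
		pending = append(pending[:0], rest...)
	}
}
```

The leftover bytes are copied to the front of the same backing array, so a steady stream of small frames should not trigger a fresh allocation on every read. A real server would probably also cap len(pending) to guard against oversized frames.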

Another, more idiomatic approach is to wrap the connection in a bufio.Reader, which takes care of the buffering for you. I've used it in a TCP server I wrote and it was extremely fast. You just do:

r := bufio.NewReader(conn)

and then you can either read until a delimiter (for example with ReadBytes or ReadString) or, if you know the length in advance, read a fixed number of bytes (for example with io.ReadFull).

huangapple
  • Posted on March 6, 2014, 16:22:23
  • When reposting, please keep the link to this article: https://go.coder-hub.com/22218838.html