How do I retrieve file data over a socket in Go?

Question

I've got two small programs communicating nicely over a socket, where the receiving side is in Go. Everything works fine when my messages are small enough to fit in the 1024-byte buffer and can be received in a single Read from the connection, but now I want to transfer image data that is 100k or more. I'm assuming the correct solution is not to increase the buffer until any image can fit inside.

Pseudo-go:

var buf = make([]byte,1024)
conn, err := net.Dial("tcp", ":1234")

for {
    r, err := conn.Read(buf[0:])
    go readHandler(string(buf[0:r]),conn)
}

How can I improve my socket read routine to accept both simple messages of a few bytes and also larger data? Bonus points if you can turn the total image data into an io.Reader for use in image.Decode.

Answer 1

Score: 3

I have no direct experience with TCP in Go, but it seems to me that you fell victim to a quite typical misunderstanding of what guarantees TCP offers.

The thing is, in contrast with, say, UDP and SCTP, TCP has no concept of message boundaries because it's stream-oriented. This means TCP transports opaque streams of bytes, and you have very little control over how that stream is "chunked" on the receiving side.

I suspect that what you observe as "sending a 100k+ message" is the runtime/network library on the sender side "deceiving" you: it consumes your "message" into its internal buffers and then streams it in whatever chunks the OS's TCP stack allows (on ubiquitous hardware/software that's usually about 8k). The size of the pieces in which the receiver gets that stream is completely undefined; the only thing defined is the ordering of the bytes in the stream, which is preserved.

Hence it might turn out that you have to reconsider your approach to receiving data. The exact approach varies depending on the nature of the data being streamed:

  • The easiest way (if you have control over the application-level protocol) is to pass the length of the following "message payload" in a special length field of fixed size and format. Destreaming a whole message is then a two-step process: 1) receive exactly as many bytes as the length field occupies, decode it, and check the value for sanity, then 2) read exactly that many following bytes and be done with it.
  • If you have no control over the app-level protocol, parsing messages becomes more involved and usually requires some sort of complicated state machine.

For more info, look at this and this.

Answer 2

Score: 2

You can use io.ReadFull to read a []byte of a specific length. This assumes that you know beforehand how many bytes you need to read.

As for image.Decode, it should be possible to pass the conn directly to the image.Decode function. This assumes that you do not perform any reads from the connection until the image is decoded.

Your code

for {
    r, err := conn.Read(buf[0:])
    go readHandler(string(buf[0:r]),conn)
}

seems to suggest that the goroutine you are starting also reads from conn. This doesn't seem like a good idea, because you will end up with multiple concurrent reads from the connection (without any control over the order in which the reads happen): one in the for-loop, another one in readHandler.

huangapple
  • Posted on October 16, 2011, 21:53:23
  • When reposting, please keep the link to this article: https://go.coder-hub.com/7784700.html