How do you read from a net.TCPConn in Go without specifying the length of the byte slice beforehand?
Question
I was trying to read some messages from a TCP connection with a Redis client (a terminal just running redis-cli). However, the Read method in the net package requires me to pass in a slice as an argument. Whenever I pass a slice with no length, the connection crashes and the Go program halts. I don't know beforehand how long my byte messages are going to be. So unless I specify some slice that is ridiculously large, which seems wasteful, this connection will always close. I was wondering, is it possible to keep a connection open without having to know the length of the message beforehand? I would love a solution to my specific problem, but I feel that this question is more general. Why do I need to know the length beforehand? Can't the library just give me a slice of the correct size?
Or what other solution do people suggest?
Answer 1
Score: 7
Not knowing the message size is precisely the reason you must specify the Read size (this goes for any networking library, not just Go). TCP is a stream protocol. As far as the TCP protocol is concerned, the message continues until the connection is closed.
If you know you're going to read until EOF, use ioutil.ReadAll.
Calling Read isn't guaranteed to get you everything you're expecting. It may return less, it may return more, depending on how much data you've received. Libraries that do IO typically read and write through a "buffer"; you would have your "read buffer", which is a pre-allocated slice of bytes (up to 32k is common), and you re-use that slice each time you want to read from the network. This is why IO functions return the number of bytes, so you know how much of the buffer was filled by the last operation. If the buffer was filled, or you're still expecting more data, you just call Read again.
Answer 2
Score: 1
A bit late but...
- One of the questions was how to determine the message size. The answer given by JimB was that TCP is a streaming protocol, so there is no real end.
- I believe this answer is incorrect. TCP divides up a bitstream into sequential packets. Each packet has an IP header and a TCP header (see Wikipedia and here). The IP header of each packet contains a field for the length of that packet. You would have to do some math to subtract out the TCP header length to arrive at the actual data length. In addition, the maximum length of a message can be specified in the TCP header.
- Thus you can provide a buffer of sufficient length for your read operation. However, you have to read the packet header information first. You probably should not accept a TCP connection if the max message size is longer than you are willing to accept.
- Normally the sender would terminate the connection with a FIN packet (see 1), not an EOF character.
- EOF in the read operation will most likely indicate that a packet was not fully transmitted within the allotted time.