Golang writing to http response breaks input reading?
Question
I'm attempting to write a small webapp in Go where the user uploads a gzipped file in a multipart form. The app unzips and parses the file and writes some output to the response. However, I keep running into an error where the input stream looks corrupted when I begin writing to the response. Not writing to the response fixes the problem, as does reading from a non-gzipped input stream. Here's an example http handler:
func(w http.ResponseWriter, req *http.Request) {
	// Get an input stream from the multipart reader
	// and read it using a scanner
	multiReader, _ := req.MultipartReader()
	part, _ := multiReader.NextPart()
	gzipReader, _ := gzip.NewReader(part)
	scanner := bufio.NewScanner(gzipReader)

	// Strings read from the input stream go to this channel
	inputChan := make(chan string, 1000)

	// Signal completion on this channel
	donechan := make(chan bool, 1)

	// This goroutine just reads text from the input scanner
	// and sends it into the channel
	go func() {
		for scanner.Scan() {
			inputChan <- scanner.Text()
		}
		close(inputChan)
	}()

	// Read lines from input channel. They all either start with #
	// or have ten tab-separated columns
	go func() {
		for line := range inputChan {
			toks := strings.Split(line, "\t")
			if len(toks) != 10 && line[0] != '#' {
				panic("Dang.")
			}
		}
		donechan <- true
	}()

	// Periodically write some random text to the response
	go func() {
		for {
			time.Sleep(10 * time.Millisecond)
			w.Write([]byte("write\n some \n output\n"))
		}
	}()

	// Wait until we're done to return
	<-donechan
}
Weirdly, this code panics every time because it always encounters a line with fewer than 10 tokens, although at different spots every time. Commenting out the line that writes to the response fixes the issue, as does reading from a non-gzipped input stream. Am I missing something obvious? Why would writing to the response break if reading from a gzip file, but not a plain text formatted file? Why would it break at all?
Answer 1
Score: 1
The HTTP protocol is not full-duplex: it is request-response based. You should only send output once you're done with reading the input.
In your code you use a for with range on a channel. This will try to read the channel until it is closed, but you never close the inputChan.
If you never close inputChan, the following line is never reached:
donechan <- true
And therefore receiving from donechan blocks:
<-donechan
You have to close the inputChan when EOF is reached:
go func() {
	for scanner.Scan() {
		inputChan <- scanner.Text()
	}
	close(inputChan) // THIS IS NEEDED
}()
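To see the close/range interaction in isolation, here is a small sketch without any HTTP involved. The function names readLines and countLines are invented for this example. The producer closes the channel with defer once the scanner runs out of input, and that is what lets the consumer's range loop finish.

```go
package main

import (
	"bufio"
	"io"
	"strings"
)

// readLines (hypothetical helper) sends every line from r on the returned
// channel and closes the channel when the scanner is exhausted.
func readLines(r io.Reader) <-chan string {
	out := make(chan string, 16)
	go func() {
		// Closing the channel is what terminates the consumer's range loop.
		defer close(out)
		scanner := bufio.NewScanner(r)
		for scanner.Scan() {
			out <- scanner.Text()
		}
	}()
	return out
}

// countLines (hypothetical) ranges over the channel until it is closed
// and returns how many lines were received.
func countLines(text string) int {
	n := 0
	for range readLines(strings.NewReader(text)) {
		n++
	}
	return n
}
```

Without the deferred close, countLines would block forever once the last line had been received, because a range over a channel only ends when the channel is closed.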