Golang request.Body.Close() returns an empty Document
Question
I have two methods in two different packages. func B() takes a URL, reads the web page, and returns an *html.Tokenizer. The problem is that it only works when I comment out defer r.Body.Close(); if I enable it, the doc returned from func B is empty.
It also works if both functions are merged into a single function, but I need them in two different packages.
Any suggestions or ideas about what I am missing here? Shouldn't the res.Body be closed?
func (s ParserService) A(u string) (*domain.Result, error) {
	doc, err := s.B("https://www.google.com/")
	if err != nil {
		fmt.Println(err.Error())
		return nil, err // doc is nil here, so stop instead of tokenizing
	}
	// Print every token until the tokenizer reports an error (io.EOF included).
	for tokenType := doc.Next(); tokenType != html.ErrorToken; {
		token := doc.Token()
		fmt.Println(token)
		tokenType = doc.Next()
	}
	return nil, nil // building the *domain.Result is not shown in the original excerpt
}

func (c Downloader) B(url string) (*html.Tokenizer, error) {
	r, err := c.httpClient.Get(url)
	if err != nil {
		return nil, err
	}
	// defer r.Body.Close()
	doc := html.NewTokenizer(r.Body)
	return doc, nil
}
Answer 1
Score: 1
tl;dr
The html.Tokenizer's Next method reads directly from the reader. Don't close the body until you've finished processing it through the tokenizer. In your example, you should perform the HTTP request and tokenize the body in the same function; then you can uncomment your deferred close.
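If the download and the parsing really have to live in separate packages, one possible arrangement is to pass the processing step into the download function, so the body stays open while it runs. The following is only a sketch under assumptions: Fetch, fetchAll, and newDemoServer are hypothetical names that are not part of the question, and a local test server stands in for the real URL. To keep the sketch standard-library only, the callback copies the body into a string; in the question's code, that callback is where html.NewTokenizer(body) and the doc.Next() loop would go.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// Downloader mirrors the type from the question; only the httpClient field is assumed.
type Downloader struct {
	httpClient *http.Client
}

// Fetch is a hypothetical replacement for B. It hands the open body to process
// and closes it only after process has returned.
func (c Downloader) Fetch(url string, process func(io.Reader) error) error {
	r, err := c.httpClient.Get(url)
	if err != nil {
		return err
	}
	defer r.Body.Close() // runs when Fetch returns, i.e. after process has finished
	return process(r.Body)
}

// newDemoServer starts a local server so the sketch needs no network access.
func newDemoServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(w, "<html><body><p>hello</p></body></html>")
	}))
}

// fetchAll plays the role of A: it supplies the processing step as a callback.
func fetchAll(url string) (string, error) {
	var sb strings.Builder
	d := Downloader{httpClient: http.DefaultClient}
	err := d.Fetch(url, func(body io.Reader) error {
		// The tokenizer loop from the question would consume body here.
		_, err := io.Copy(&sb, body)
		return err
	})
	return sb.String(), err
}

func main() {
	srv := newDemoServer()
	defer srv.Close()

	page, err := fetchAll(srv.URL)
	if err != nil {
		fmt.Println("fetch failed:", err)
		return
	}
	fmt.Println(page)
}
```

With this shape the deferred close can stay enabled, because the caller's processing finishes before Fetch returns.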
Details
html.Tokenizer accepts an io.Reader from which the tokenizer will read until it receives an io.EOF error. This "error" indicates that there is nothing left to be read and the tokenizer source is completed.
http.Response.Body is an io.ReadCloser, which is a combination of an io.Reader and an io.Closer. What happens after a call to Close is implementation specific; however, for http.Response.Body, no more bytes can be read from the reader after Close is called. Because B defers the Close, the body is closed as soon as B returns, which is before A has called Next even once.
Your problem is ultimately caused by prematurely closing the http.Response.Body (io.ReadCloser).
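The effect described above can be reproduced with the standard library alone. The sketch below is illustrative rather than definitive: readAfterClose, fetchBuffered, drain, and newDemoServer are hypothetical helpers, and a local test server replaces the real URL. readAfterClose closes the body before reading, which is roughly what the deferred Close in B does from A's point of view. fetchBuffered shows one alternative, assuming the page is small enough to hold in memory: copy the body before closing it and return a reader over the copy, which could then be passed to html.NewTokenizer since that function accepts any io.Reader.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// newDemoServer starts a local server so the sketch needs no network access.
func newDemoServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(w, "<html><body><p>hello</p></body></html>")
	}))
}

// readAfterClose closes the body first and only then tries to read from it.
func readAfterClose(url string) ([]byte, error) {
	r, err := http.Get(url)
	if err != nil {
		return nil, err
	}
	r.Body.Close()
	return io.ReadAll(r.Body) // the body is already closed at this point
}

// fetchBuffered reads the whole body into memory before closing it and
// returns a reader over that in-memory copy.
func fetchBuffered(url string) (io.Reader, error) {
	r, err := http.Get(url)
	if err != nil {
		return nil, err
	}
	defer r.Body.Close() // safe: the data has been copied before this runs
	data, err := io.ReadAll(r.Body)
	if err != nil {
		return nil, err
	}
	return bytes.NewReader(data), nil
}

// drain reads everything from rd and returns it as a string.
func drain(rd io.Reader) string {
	data, _ := io.ReadAll(rd)
	return string(data)
}

func main() {
	srv := newDemoServer()
	defer srv.Close()

	data, err := readAfterClose(srv.URL)
	fmt.Println("after close:", len(data), "bytes, err:", err)

	rd, err := fetchBuffered(srv.URL)
	if err != nil {
		fmt.Println("fetch failed:", err)
		return
	}
	fmt.Println("buffered:", drain(rd))
}
```

Buffering trades memory for simplicity; for large pages, keeping the body open while the tokenizer runs avoids holding the whole document in memory.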