Extending Golang's http.Resp.Body to handle large files
Question
I have a client application that reads the full body of an HTTP response into a buffer and performs some processing on it:
body, _ = ioutil.ReadAll(containerObject.Resp.Body)
The problem is that this application runs on an embedded device, so responses that are too large fill up the device's RAM, causing Ubuntu to kill the process.
To avoid this, I check the Content-Length header and bypass processing if the document is too large. However, some servers (I'm looking at you, Microsoft) send very large HTML responses without setting Content-Length, which crashes the device.
The only way I can see around this is to read the response body up to a certain length. If it reaches this limit, a new reader could be created that first streams the in-memory buffer and then continues reading from the original Resp.Body. Ideally, I would assign this new reader to containerObject.Resp.Body so that callers would not know the difference.
I'm new to Go and am not sure how to go about coding this. Any suggestions or alternative solutions would be greatly appreciated.
Edit 1: The caller expects a Resp.Body object, so the solution needs to be compatible with that interface.
Edit 2: I cannot parse small chunks of the document. Either the entire document is processed or it is passed unchanged to the caller, without loading it into memory.
Answer 1
Score: 3
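If you need to read part of the response body and then reconstruct it in place for other callers, you can use a combination of io.MultiReader and ioutil.NopCloser (since Go 1.16, io.ReadAll and io.NopCloser are the preferred equivalents of the ioutil functions used here):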
resp, err := http.Get("http://google.com")
if err != nil {
	return err
}
// The deferred call captures the original body, so it is still closed
// even after resp.Body is reassigned below.
defer resp.Body.Close()
// maxReadSize is the maximum number of bytes to buffer in memory.
part, err := ioutil.ReadAll(io.LimitReader(resp.Body, maxReadSize))
if err != nil {
	return err
}
// do something with part
// recombine the buffered part of the body with the rest of the stream
resp.Body = ioutil.NopCloser(io.MultiReader(bytes.NewReader(part), resp.Body))
// do something with the full Response.Body as an io.Reader
If you can't defer resp.Body.Close() because you intend to return the response before it has been read in its entirety, you will need to augment the replacement body so that its Close() method applies to the original body. Rather than using ioutil.NopCloser as the io.ReadCloser, create your own type that forwards to the correct methods:
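// readCloser takes Close from one value and Read from another.
type readCloser struct {
	io.Closer
	io.Reader
}
resp.Body = readCloser{
	Closer: resp.Body, // Close is forwarded to the original body
	Reader: io.MultiReader(bytes.NewReader(part), resp.Body),
}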