golang reading XML memory leak?
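Question
We've been decoding a lot of XML lately using Go and `encoding/xml`. We noticed that, after processing quite a few files, our servers run out of memory, start swapping, and eventually crash. So we wrote a test program: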

    package main

    import (
        "encoding/xml"
        "io/ioutil"
        "log"
        "time"
    )

    // This XML is for reading AWS SQS messages.
    type message struct {
        Body          []string `xml:"ReceiveMessageResult>Message>Body"`
        ReceiptHandle []string `xml:"ReceiveMessageResult>Message>ReceiptHandle"`
    }

    func main() {
        var m message
        readTicker := time.NewTicker(5 * time.Millisecond)
        body, err := ioutil.ReadFile("test.xml")
        if err != nil {
            // Stop early if the input file cannot be read.
            log.Fatal(err)
        }
        for {
            select {
            case <-readTicker.C:
                // Decode the same document into the same variable on every tick.
                err = xml.Unmarshal(body, &m)
                if err != nil {
                    log.Println(err.Error())
                }
            }
        }
    }

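All it does is decode the same XML file over and over. Our servers show the same symptom: the binary's memory usage grows without bound until the machine starts swapping.
We also added some profiling code that fires 20 seconds into the program above, and got the following from `pprof`'s `top100`: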

    (pprof) top100
    Total: 56.0 MB
        55.0  98.2%  98.2%     55.0  98.2% encoding/xml.copyValue
         1.0   1.8% 100.0%      1.0   1.8% cnew
         0.0   0.0% 100.0%      0.5   0.9% bytes.(*Buffer).WriteByte
         0.0   0.0% 100.0%      0.5   0.9% bytes.(*Buffer).grow
         0.0   0.0% 100.0%      0.5   0.9% bytes.makeSlice
         0.0   0.0% 100.0%     55.5  99.1% encoding/xml.(*Decoder).Decode
    ...

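Running this again later, before the machine runs out of memory, yields a higher total but roughly the same percentages. Can anyone help us out? What are we missing?
Thanks in advance!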
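# Answer 1
**Score**: 8
Try printing out your message on each iteration. `xml.Unmarshal` keeps appending to the slice fields of the original struct, so `m` grows with every call.
Reset the message with `m = message{}` once you have done what you need with it; otherwise it will keep growing.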
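
The growth described above can be reproduced with a small sketch. The `message` struct is the one from the question; the `sample` document and the `decodeN` helper are hypothetical stand-ins added for illustration (the real `test.xml` is not shown in the question). The sketch decodes the same document several times into one variable, with and without the reset:

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// message is the struct from the question.
type message struct {
	Body          []string `xml:"ReceiveMessageResult>Message>Body"`
	ReceiptHandle []string `xml:"ReceiveMessageResult>Message>ReceiptHandle"`
}

// sample is a made-up document with one message; it only stands in
// for the test.xml file used in the question.
const sample = `<ReceiveMessageResponse><ReceiveMessageResult><Message>` +
	`<Body>hello</Body><ReceiptHandle>abc</ReceiptHandle>` +
	`</Message></ReceiveMessageResult></ReceiveMessageResponse>`

// decodeN (hypothetical helper) unmarshals sample n times into the same
// variable and returns the final length of m.Body. When reset is true,
// m is cleared with m = message{} before each decode.
func decodeN(n int, reset bool) (int, error) {
	var m message
	for i := 0; i < n; i++ {
		if reset {
			m = message{} // drop the slices left over from the previous decode
		}
		if err := xml.Unmarshal([]byte(sample), &m); err != nil {
			return 0, err
		}
	}
	return len(m.Body), nil
}

func main() {
	grown, err := decodeN(3, false)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	kept, err := decodeN(3, true)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	// Without the reset, Body gains one element per decode.
	fmt.Println("without reset:", grown)
	fmt.Println("with reset:", kept)
}
```

In the question's loop the same effect applies on every tick, which is consistent with nearly all of the allocations in the profile coming from inside `encoding/xml`.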
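# Answer 2
**Score**: 1
I haven't tested this, but have you tried unmarshaling the XML into a new variable each time?
As far as I can see, you are decoding through a pointer, which might cause some memory issues.
Of course, I might be totally wrong.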
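
A minimal sketch of the "new variable each time" idea, untested against the asker's real data: the `sample` document and the `decodeFresh` helper are hypothetical names introduced here. Passing a pointer to `xml.Unmarshal` is required and is not a problem in itself; what a fresh variable changes is that its slice fields start out empty on every call.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// message is the struct from the question.
type message struct {
	Body          []string `xml:"ReceiveMessageResult>Message>Body"`
	ReceiptHandle []string `xml:"ReceiveMessageResult>Message>ReceiptHandle"`
}

// sample is a made-up document standing in for test.xml.
const sample = `<ReceiveMessageResponse><ReceiveMessageResult><Message>` +
	`<Body>hello</Body><ReceiptHandle>abc</ReceiptHandle>` +
	`</Message></ReceiveMessageResult></ReceiveMessageResponse>`

// decodeFresh (hypothetical helper) unmarshals data into a brand-new
// message, so nothing from earlier calls can accumulate in the result.
func decodeFresh(data []byte) (message, error) {
	var m message // zero value: both slices are nil
	err := xml.Unmarshal(data, &m)
	return m, err
}

func main() {
	for i := 0; i < 3; i++ {
		m, err := decodeFresh([]byte(sample))
		if err != nil {
			fmt.Println("decode failed:", err)
			return
		}
		// The length stays constant because m is new on every iteration.
		fmt.Println(len(m.Body), len(m.ReceiptHandle))
	}
}
```

Once the previous `message` value is no longer referenced, the garbage collector is free to reclaim it.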