fatal error out of memory
Question
So I wrote a daemon in Go that handles around 800k documents, and I'm running into an out-of-memory problem.
From what I saw, memory usage increases with every loop when getting the documents from MongoDB.
    func main() {
        session, err := mgo.Dial("localhost")
        if err != nil { panic(err) }
        defer session.Close()
        subscriptionsC = session.DB("sm").C("subscriptions")
        subscriptions := []Subscription{}
        for {
            subscriptions = GetSubscriptions()
            // ... (the rest of the loop body is not shown in the original post)
        }
    }
And the other function is:
    func GetSubscriptions() []Subscription {
        result := []Subscription{}
        err := subscriptionsC.Find(nil).Prefetch(0.0).All(&result)
        if err != nil { Log("signups_err", err.Error() + "\n") }
        return result
    }
I don't know if it's redeclaring the array with each loop or what exactly happens.
Any help would be greatly appreciated.
Answer 1
Score: 1
Author of mgo here.
There's nothing wrong with your code, but it's incomplete, so it's always possible that something you're not showing is in fact leaking memory.
Can you provide a full example that leaks memory?
There's no point in caching/pooling sessions, by the way, because mgo internally handles pooling of resources for you. What you must do is to make sure you close the sessions you create, which the sample code does.
Update after OP's comment below:
> Seems that the problem is with a high amount of docs. pastebin.com/jUDmbS4z this will crash once every 10-15 mins (around 4-5 loops). It's getting around 600k docs from mongo in one loop.
Yeah, running queries that load a ridiculous amount of data into memory at once can easily create trouble for a number of reasons unrelated to mgo: memory fragmentation, a non-precise collector, etc. Just iterate over the items as they arrive, as usual; it is comfortable, fast, and will dramatically reduce the amount of memory used, as you already figured.
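For readers who want to see what "iterate over the items as they arrive" might look like, here is a minimal sketch. It is not the OP's code: the `Subscription` fields, `fakeIter`, `docIter`, and `processSubscriptions` are all hypothetical names made up for illustration. The `docIter` interface only mirrors the `Next(result interface{}) bool` and `Close() error` methods of `*mgo.Iter`, and an in-memory fake stands in for the cursor so the example runs without a database. With mgo, the iterator would presumably come from `subscriptionsC.Find(nil).Iter()` instead of `All(&result)`.

```go
package main

import "fmt"

// Subscription is a hypothetical stand-in for the poster's document type.
type Subscription struct {
	ID    int
	Email string
}

// docIter mirrors the two *mgo.Iter methods this sketch relies on.
type docIter interface {
	Next(result interface{}) bool
	Close() error
}

// fakeIter is an in-memory replacement for a MongoDB cursor (assumption:
// used here only so the sketch runs without a server).
type fakeIter struct {
	total, pos int
}

// Next fills result with the next fake document, or returns false when done.
func (it *fakeIter) Next(result interface{}) bool {
	if it.pos >= it.total {
		return false
	}
	sub, ok := result.(*Subscription)
	if !ok {
		return false
	}
	it.pos++
	*sub = Subscription{ID: it.pos, Email: fmt.Sprintf("user%d@example.com", it.pos)}
	return true
}

// Close has nothing to release for the fake iterator.
func (it *fakeIter) Close() error { return nil }

// processSubscriptions handles one document at a time, reusing a single
// Subscription value instead of building a slice of every document.
func processSubscriptions(it docIter, handle func(Subscription)) (int, error) {
	var sub Subscription
	count := 0
	for it.Next(&sub) {
		handle(sub)
		count++
	}
	// Close returns any error that occurred while iterating.
	return count, it.Close()
}

func main() {
	// With mgo, the iterator would come from subscriptionsC.Find(nil).Iter().
	n, err := processSubscriptions(&fakeIter{total: 3}, func(s Subscription) {
		fmt.Println(s.ID, s.Email)
	})
	fmt.Println("processed", n, "err", err)
}
```

The point of the pattern is that only one `Subscription` is live at a time, so memory use no longer scales with the size of the collection.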
Answer 2
Score: 0
The array is definitely being initialized in every loop because of the call to GetSubscriptions(), which runs result := []Subscription{} on each iteration, but I think that's not the source of the problem.
The problem could be coming from your global session; see "Database connections in web applications". The proper way would be to use a session pool.
Edit: also see "How do I call mongoDB CRUD method from handler?"