Mysterious and excessive memory allocation in a function in a Go program

Question

I have the following code, which uses tons of memory, far more than expected.
I used the pprof tool, and it shows that the function NewEdge accounts for more than 94% of all the memory allocated by the program.

My question is: what is wrong with this code that makes it use so much memory?

    type Vertex struct {
        Id         string            `json:"id"`         // must be unique
        Properties map[string]string `json:"properties"` // to be implemented soon
        // Keys are Vertex ids, values are edge ids (instead of *Edge).
        // Each pair of vertices can be connected by multiple edges.
        verticesThisIsConnectedTo map[string][]string `json:"-"`
        verticesConnectedToThis   map[string][]string `json:"-"`
    }

    type Edge struct {
        id         string            `json:"-"` // for internal use, unique
        Label      string            `json:"label"`
        SourceId   string            `json:"source-id"`
        TargetId   string            `json:"target-id"`
        Type       string            `json:"type"`
        Properties map[string]string `json:"properties"` // to be implemented soon
    }

    func (v *Vertex) isPartof(g *Graph) bool {
        _, b := g.Vertices[v.Id]
        return b
    }

    func (g *Graph) NewEdge(source, target *Vertex, label, edgeType string) (Edge, error) {
        if source.Id == target.Id {
            return Edge{}, ERROR_NO_EDGE_TO_SELF_ALLOWED
        }
        if !source.isPartof(g) || !target.isPartof(g) {
            return Edge{}, errors.New("InvalidEdge, source or target not in this graph")
        }
        // nextId is a channel that yields unique ids
        e := Edge{id: <-nextId, Label: label, SourceId: source.Id, TargetId: target.Id, Type: edgeType}
        g.Edges[e.id] = &e // storing &e makes e escape to the heap
        source.verticesThisIsConnectedTo[target.Id] = append(source.verticesThisIsConnectedTo[target.Id], e.id)
        target.verticesConnectedToThis[source.Id] = append(target.verticesConnectedToThis[source.Id], e.id)
        return e, nil
    }

The allocation happens in a call like fakeGraph(Aragog, 2000, 1), where fakeGraph is:

    func fakeGraph(g Graph, nodesCount, followratio int) error {
        var err error
        // create the vertices
        for i := 0; i < nodesCount; i++ {
            v := NewVertex("") // FH.RandStr(10)
            g.AddVertex(v)
        }
        // create some "follow edges"
        followcount := followratio * nodesCount / 100
        vkeys := []string{}
        for pk := range g.Vertices {
            vkeys = append(vkeys, pk)
        }
        for ki := range g.Vertices {
            pidx := rand.Perm(nodesCount)
            followcounter := followcount
            for j := 0; j < followcounter; j++ {
                // note: := declares a new err that shadows the outer one,
                // so fakeGraph always returns nil
                _, err := g.NewEdge(g.Vertices[ki], g.Vertices[vkeys[pidx[j]]], <-nextId, EDGE_TYPE_FOLLOW)
                if err != nil {
                    followcounter++ // to compensate for references to self
                }
            }
        }
        return err
    }

Question / mystery:

I can create thousands of Vertex values and the memory usage is very reasonable. But calls to NewEdge are very memory intensive. I first noticed that the code was using tons of memory. I ran the program with -memprofile, then used go tool pprof and got this:

    (pprof) top10
    Total: 9.9 MB
         8.9  89.9%  89.9%      8.9  89.9% main.(*Graph).NewEdge
         0.5   5.0%  95.0%      0.5   5.0% allocg
         0.5   5.0% 100.0%      0.5   5.0% fmt.Sprintf
         0.0   0.0% 100.0%      0.5   5.0% _rt0_go
         0.0   0.0% 100.0%      8.9  89.9% main.fakeGraph
         0.0   0.0% 100.0%      0.5   5.0% main.func·003
         0.0   0.0% 100.0%      8.9  89.9% main.main
         0.0   0.0% 100.0%      0.5   5.0% mcommoninit
    (pprof)

Any help is very much appreciated.

Answer 1

Score: 1

@ali I don't think there is any mystery in this memory profile.
First of all, if you check the size of your structs, you will see that the Edge struct is about twice as big as the Vertex struct. (You can check struct sizes with unsafe.Sizeof().)
So if you call fakeGraph(Aragog, 2000, 1), Go will allocate:

  • 2000 Vertex structs
  • at least 2000 * 20 = 40,000 Edge structs (followcount = 1 * 2000 / 100 = 20 edges per vertex)
    With 20 times as many structs, each about twice the size, NewEdge() allocates at least 40 times more memory than the vertex creation in fakeGraph().

Also, every call to NewEdge() produces a new Edge value, even when NewEdge() returns an error (on the error paths this is only an empty Edge{} returned by value, plus the error itself).

Another factor: you return the struct itself, not a pointer to the struct. In Go, structs are value types, so the entire struct is copied when NewEdge() returns, and that may also cause a new allocation.
Yes, I see that you never use the returned struct, but I'm not sure whether the Go compiler checks the caller's context and skips copying the Edge.
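
One way to settle the doubt about returning by value is to measure it. The sketch below is not from the original post: it uses testing.AllocsPerRun from the standard library on two made-up functions, byValue and storeAndReturn, with a trimmed copy of the Edge struct. storeAndReturn mimics what NewEdge does (it stores &e in a map and then returns a copy); byValue only returns the struct.

```go
package main

import (
	"fmt"
	"testing"
)

// Edge is a trimmed copy of the struct from the question.
type Edge struct {
	id, Label, SourceId, TargetId, Type string
	Properties                          map[string]string
}

// edges stands in for g.Edges; sink keeps results alive so calls are not optimized away.
var (
	edges = map[string]*Edge{}
	sink  Edge
)

// byValue (hypothetical) builds an Edge and returns a copy; no pointer to it is kept.
//
//go:noinline
func byValue(id string) Edge {
	return Edge{id: id, Label: "follow"}
}

// storeAndReturn (hypothetical) mirrors NewEdge: &e is stored in a map,
// so e must live on the heap, and a copy is returned to the caller.
//
//go:noinline
func storeAndReturn(id string) Edge {
	e := Edge{id: id, Label: "follow"}
	edges[e.id] = &e
	return e
}

func main() {
	v := testing.AllocsPerRun(1000, func() { sink = byValue("e1") })
	s := testing.AllocsPerRun(1000, func() { sink = storeAndReturn("e1") })
	fmt.Println("allocations per call, return by value only:", v)
	fmt.Println("allocations per call, store &e and return: ", s)
}
```

With current Go toolchains the by-value return is expected to report no heap allocation, since the copy lands in the caller's frame, while the version that stores &e allocates the Edge on the heap once per call. So the copy on return is unlikely to be what drives the profile; the heap cost comes from keeping a pointer to each edge, and in the original code also from the id strings and the per-vertex slices and map entries.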

Posted on June 20, 2014, 17:57:19
Source: https://go.coder-hub.com/24324615.html