Mysterious and excessive memory allocation in a function in a Go program

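Question

I have the following code, which uses tons of memory, far more than expected. I used the pprof tool, and it shows that the function NewEdge is allocating more than 94% of all the memory allocated by the program.

My question is: what is wrong with this code that makes it use so much memory?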

type Vertex struct {
    Id                        string              `json:"id"`         // must be unique
    Properties                map[string]string   `json:"properties"` // to be implemented soon
    verticesThisIsConnectedTo map[string][]string `json:"-"`          //id for the edges *Edge // keys are Vertex ids, each pair of vertices can be connected to each other with multiple edges
    verticesConnectedToThis   map[string][]string `json:"_"`          //id for the edges *Edge // keys are Vertex ids,
}
type Edge struct {
    id         string            `json:"-"` // for internal use, unique
    Label      string            `json:"label"`
    SourceId   string            `json:"source-id"`
    TargetId   string            `json:"terget-id"`
    Type       string            `json:"type"`
    Properties map[string]string `json:"properties"` // to be implemented soon
}
func (v *Vertex) isPartof(g *Graph) bool {
    _, b := g.Vertices[v.Id]
    return b
}
func (g *Graph) NewEdge(source, target *Vertex, label, edgeType string) (Edge, error) {
    if source.Id == target.Id {
        return Edge{}, ERROR_NO_EDGE_TO_SELF_ALLOWED
    }
    if !source.isPartof(g) || !target.isPartof(g) {
        return Edge{}, errors.New("InvalidEdge, source or target not in this graph")
    }

    e := Edge{id: <-nextId, Label: label, SourceId: source.Id, TargetId: target.Id, Type: edgeType}
    g.Edges[e.id] = &e

    source.verticesThisIsConnectedTo[target.Id] = append(source.verticesThisIsConnectedTo[target.Id], e.id)
    target.verticesConnectedToThis[source.Id] = append(target.verticesConnectedToThis[source.Id], e.id)
    return e, nil
}

func fakeGraph(g Graph, nodesCount, followratio int) error {
    var err error
    // create the vertices
    for i := 0; i < nodesCount; i++ {
        v := NewVertex("")
        g.AddVertex(v)
    }
    // create some "follow edges"
    followcount := followratio * nodesCount / 100
    vkeys := []string{}
    for pk := range g.Vertices {
        vkeys = append(vkeys, pk)
    }
    for ki := range g.Vertices {
        pidx := rand.Perm(nodesCount)
        followcounter := followcount
        for j := 0; j < followcounter; j++ {
            _, err := g.NewEdge(g.Vertices[ki], g.Vertices[vkeys[pidx[j]]], <-nextId, EDGE_TYPE_FOLLOW)
            if err != nil {
                followcounter++ // to compensate for references to self
            }
        }
    }
    return err
}

The allocation happens in a call like this: fakeGraph(Aragog, 2000, 1), where fakeGraph is the function shown above.

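Question / mystery:

I can create thousands of Vertex values and the memory usage is very reasonable, but calls to NewEdge are very memory intensive. I first noticed that the code was using tons of memory. I ran the program with -memprofile, then used go tool pprof and got this: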

(pprof) top10
Total: 9.9 MB
8.9  89.9%  89.9%      8.9  89.9% main.(*Graph).NewEdge
0.5   5.0%  95.0%      0.5   5.0% allocg
0.5   5.0% 100.0%      0.5   5.0% fmt.Sprintf
0.0   0.0% 100.0%      0.5   5.0% _rt0_go
0.0   0.0% 100.0%      8.9  89.9% main.fakeGraph
0.0   0.0% 100.0%      0.5   5.0% main.func·003
0.0   0.0% 100.0%      8.9  89.9% main.main
0.0   0.0% 100.0%      0.5   5.0% mcommoninit
(pprof)

Any help is very much appreciated.

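Answer 1

Score: 1

@ali I don't think there is any mystery in this memory profile.
First of all, if you check the size of your structs, you will see that the Edge struct is about twice as big as the Vertex struct. (You can check struct sizes with unsafe.Sizeof().)
So, if you call fakeGraph(Aragog, 2000, 1), Go will allocate:

  • 2000 Vertex structs
  • at least 2000 * 20 = 40,000 Edge structs

With 20 times as many structs, each about twice the size, NewEdge() will allocate at least 40 times more memory than the vertex-creation loop in fakeGraph().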
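
The size and count estimate above can be checked with a short program. This is a minimal sketch: the struct definitions are copied from the question with the JSON tags dropped, and structSizes and edgeCount are hypothetical helper names introduced here. Note that unsafe.Sizeof reports only the fixed-size part of each struct (string headers and map pointers), not the memory that the strings, maps, and slices point to.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Vertex and Edge mirror the field layout from the question (JSON tags omitted).
type Vertex struct {
	Id                        string
	Properties                map[string]string
	verticesThisIsConnectedTo map[string][]string
	verticesConnectedToThis   map[string][]string
}

type Edge struct {
	id         string
	Label      string
	SourceId   string
	TargetId   string
	Type       string
	Properties map[string]string
}

// structSizes returns the fixed-size footprint of one Vertex and one Edge.
func structSizes() (vertex, edge uintptr) {
	return unsafe.Sizeof(Vertex{}), unsafe.Sizeof(Edge{})
}

// edgeCount reproduces the arithmetic in fakeGraph: every vertex gets
// followratio*nodesCount/100 outgoing edges.
func edgeCount(nodesCount, followratio int) int {
	perVertex := followratio * nodesCount / 100
	return nodesCount * perVertex
}

func main() {
	v, e := structSizes()
	fmt.Println("Vertex bytes:", v)
	fmt.Println("Edge bytes:  ", e)
	fmt.Println("edges for fakeGraph(g, 2000, 1):", edgeCount(2000, 1)) // 40000
}
```

The per-edge cost in the real program is likely higher than the struct size alone, because each call also appends an id to two slices stored in maps and inserts into g.Edges.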

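Also, every time you try to create a new edge, a new Edge struct is allocated, even if NewEdge() returns an error.

Another factor is that you return the struct itself, not a pointer to it. In Go, structs are value types, so the entire struct is copied when NewEdge() returns, and that can also cause a new allocation.
Yes, I see that you never use the returned struct, but I'm not sure whether the Go compiler checks the caller's context and skips copying the Edge.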
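
Whether returning by value costs a heap allocation can be measured rather than guessed. The sketch below uses testing.AllocsPerRun from the standard library to compare a constructor that only returns an Edge by value with one that also stores &e in a map, as NewEdge does with g.Edges. The names newEdgeByValue, newEdgeStored, measure, edges, and sink are hypothetical and exist only for this illustration; the exact numbers depend on the compiler version, so none are claimed here.

```go
package main

import (
	"fmt"
	"testing"
)

// Edge mirrors the struct from the question (JSON tags omitted).
type Edge struct {
	id         string
	Label      string
	SourceId   string
	TargetId   string
	Type       string
	Properties map[string]string
}

// edges stands in for g.Edges from the question.
var edges = map[string]*Edge{}

// sink keeps the results alive so the compiler cannot discard the calls.
var sink Edge

//go:noinline
func newEdgeByValue(id string) Edge {
	return Edge{id: id} // copied to the caller; the address is never taken
}

//go:noinline
func newEdgeStored(id string) Edge {
	e := Edge{id: id}
	edges[e.id] = &e // storing the address in a map makes e escape to the heap
	return e
}

// measure reports the average number of heap allocations per call.
func measure() (byValue, stored float64) {
	byValue = testing.AllocsPerRun(1000, func() { sink = newEdgeByValue("a") })
	stored = testing.AllocsPerRun(1000, func() { sink = newEdgeStored("a") })
	return byValue, stored
}

func main() {
	v, s := measure()
	fmt.Println("allocations per call, returned by value only:", v)
	fmt.Println("allocations per call, address stored in map:", s)
}
```

Building with go build -gcflags=-m prints the compiler's escape-analysis decisions, which is another way to see which variables end up on the heap.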

  • Posted on June 20, 2014, 17:57:19
  • When reposting, please keep the link to this article: https://go.coder-hub.com/24324615.html