What is the meaning of 'leak/leaking param' in Golang Escape Analysis
Question
func main() {
	i1 := 1
	A1(&i1)
}
func A1(i1 *int) *int {
	return i1
}
The result of escape analysis is:
./main.go:18:9: parameter i1 leaks to ~r1 with derefs=0:
./main.go:18:9: flow: ~r1 = i1:
./main.go:18:9: from return i1 (return) at ./main.go:19:2
./main.go:18:9: leaking param: i1 to result ~r1 level=0
What do "parameter i1 leaks to ~r1 with derefs=0" and "leaking param: i1 to result ~r1 level=0" mean?
First I tried Googling "golang escape leaking"; the most relevant result was a comment on escape-analysis-shows-channel-as-leaking-param:
> "Why would you think that?" It's reasonable to assume that leaking is bad and related to its stem leak. I am struggling to think of an example context where leaking is a good thing, e.g. leaking bucket, leaking gas tank, taking a leak, leaking capacitor, leaky boat, leaky abstraction. It may be obvious to high performance go experts, but for the rest of us it would be helpful to link to docs and provide brief clarification of what leaking param refers to
That is the same question I want to ask, but there were no further replies.
Then I tried reading the source code that prints these results.
In compile/internal/escape/leaks.go, I found this comment:
> // An leaks represents a set of assignment flows from a parameter
> // to the heap or to any of its function's (first numEscResults)
> // result parameters.
But I can't understand this. Is there any official documentation that explains it?
Besides, the source code raised one more question:
do result parameters after the first numEscResults (7) escape to the heap at run time?
func main() {
	i1, i2, i3, i4, i5, i6, i7, i8, i9 := 1, 1, 1, 1, 1, 1, 1, 1, 1
	A1(&i1, &i2, &i3, &i4, &i5, &i6, &i7, &i8, &i9)
	return
}
func A1(i1, i2, i3, i4, i5, i6, i7, i8, i9 *int) (*int, *int, *int, *int, *int, *int, *int, *int, *int) {
	return i1, i2, i3, i4, i5, i6, i7, i8, i9
}
...some duplicate output
./main.go:16:13: leaking param: i2 to result ~r10 level=0
./main.go:16:17: leaking param: i3 to result ~r11 level=0
./main.go:16:21: leaking param: i4 to result ~r12 level=0
./main.go:16:25: leaking param: i5 to result ~r13 level=0
./main.go:16:29: leaking param: i6 to result ~r14 level=0
./main.go:16:33: leaking param: i7 to result ~r15 level=0
./main.go:16:37: leaking param: i8
./main.go:16:41: leaking param: i9
./main.go:8:30: i8 escapes to heap:
./main.go:8:30: flow: {heap} = &i8:
./main.go:8:30: from &i8 (address-of) at ./main.go:9:40
./main.go:8:30: from A1(&i1, &i2, &i3, &i4, &i5, &i6, &i7, &i8, &i9) (call parameter) at ./main.go:9:4
./main.go:8:34: i9 escapes to heap:
./main.go:8:34: flow: {heap} = &i9:
./main.go:8:34: from &i9 (address-of) at ./main.go:9:45
./main.go:8:34: from A1(&i1, &i2, &i3, &i4, &i5, &i6, &i7, &i8, &i9) (call parameter) at ./main.go:9:4
./main.go:8:30: moved to heap: i8
./main.go:8:34: moved to heap: i9
Answer 1
Score: 8
> What do "parameter i1 leaks to ~r1 with derefs=0" and "leaking param: i1 to result ~r1 level=0" mean?
TL;DR: If you're hunting down allocations, ignore "leaking param" and look for the "moved to heap" lines.
A "leaking param" means that the function keeps its parameter alive after it returns, for example by returning it or storing it somewhere. This doesn't mean the value is moved to the heap; in fact, most leaking params are allocated on the stack.
"~r1" is the compiler's name for an unnamed result parameter. Judging by the output in the question, this compiler version numbers results after the input parameters: A1 has one parameter (index 0), so its only result is ~r1, and in the question's nine-parameter example i2 leaks to ~r10, the second result. (As far as I know, newer compiler versions number results starting from ~r0.) In the first snippet, i1 leaks because "~r1 = i1": the return value is i1, so i1 must be kept alive after the return, "leaking" to the caller.
The lines that come before "leaking param" in the compiler output appear because the OP is using '-m -m', which prints the data flow graph.
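To make the difference between "leaking param" and "moved to heap" concrete, here is a minimal sketch. The function names (identity, store, read) and the variable sink are made up for illustration, and the diagnostics named in the comments are what I would expect from building with go build -gcflags='-m -m'; the exact wording and result names can differ between compiler versions.

```go
package main

import "fmt"

// sink is a package-level variable; a pointer stored here outlives the call.
var sink *int

// identity returns its parameter, so p flows to the result.
// Expected diagnostic: "leaking param: p to result ..." (no heap move implied).
//
//go:noinline
func identity(p *int) *int { return p }

// store saves its parameter in a global, so p flows to the heap.
// Expected diagnostic: "leaking param: p" (without a result name).
//
//go:noinline
func store(p *int) { sink = p }

// read only dereferences p during the call.
// Expected diagnostic: "p does not escape".
//
//go:noinline
func read(p *int) int { return *p }

func main() {
	x, y, z := 1, 2, 3
	r := identity(&x) // x can stay on the stack: the result is only used locally
	store(&y)         // y is expected to be reported as "moved to heap: y"
	fmt.Println(*r, *sink, read(&z))
}
```

Only the call to store should force an allocation here, which is why the "moved to heap" lines are the ones worth searching for.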
For deref, from the comment in cmd/compile/internal/escape/escape.go:
> [...] The number of dereference operations minus the number of
> addressing operations is recorded as the edge's weight (termed
> "derefs").
"level" is not described in the current comments, and it's been a while since I was familiar with the gc source code. As far as I can tell, it's the level of memory indirection: a dereference (*) increments it and an address-of (&) decrements it. Thus this function
func A1(a **int) *int {
	p := &a
	return **p
}
should give a leaking param a with level=1.
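As a rough illustration of how the count works, the sketch below puts three variants side by side. The function names are hypothetical, and the levels in the comments follow the "dereferences minus address-of operations" rule quoted above rather than output I have verified against a specific compiler version.

```go
package main

import "fmt"

// direct returns the parameter itself: no dereference, so level 0 is expected.
//
//go:noinline
func direct(p *int) *int { return p }

// viaDeref returns what the parameter points to: one dereference,
// so level 1 is expected.
//
//go:noinline
func viaDeref(pp **int) *int { return *pp }

// viaAddrThenDeref takes one address (-1) and dereferences twice (+2),
// so the net level is again expected to be 1.
//
//go:noinline
func viaAddrThenDeref(pp **int) *int {
	q := &pp
	return **q
}

func main() {
	n := 42
	p := &n
	fmt.Println(*direct(p), *viaDeref(&p), *viaAddrThenDeref(&p)) // prints "42 42 42"
}
```

Building this with '-m' should show the level on each "leaking param ... to result" line, so the guess about level is easy to check on your own toolchain.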
> Besides, the source code raised one more question: do result parameters after the first numEscResults (7) escape to the heap at run time?
Yes. Only flows to the first 7 results are tracked individually; a parameter that flows to a later result is treated as leaking to the heap, so the variables whose addresses are passed for it are moved to the heap (i8 and i9 in the output above). I don't know the exact reason for 7, but from experience with the gc source code I'd guess it's a value that doesn't slow down compilation too much yet preserves the optimization for most functions.
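The fallback can be pictured with a small model. This is a simplified sketch, not the compiler's code: leakSet, record, and leaksToHeap are hypothetical names, and the only things taken from the question are the limit numEscResults = 7 and the idea from the quoted comment that a parameter's flows go either to the heap or to one of the first numEscResults results.

```go
package main

import "fmt"

// numEscResults mirrors the limit quoted in the question.
const numEscResults = 7

// leakSet is a hypothetical stand-in for the per-parameter summary.
// Slot 0 tracks a flow to the heap; slots 1..numEscResults track flows to
// the first numEscResults results. A slot holds derefs+1; 0 means "no flow".
type leakSet [1 + numEscResults]uint8

// record notes a flow from the parameter to the result with the given
// 0-based index. Results past the limit have no slot of their own, so the
// flow is recorded as a heap leak instead.
func (l *leakSet) record(resultIndex, derefs int) {
	slot := 0 // heap
	if resultIndex < numEscResults {
		slot = 1 + resultIndex
	}
	v := uint8(derefs + 1)
	if l[slot] == 0 || v < l[slot] {
		l[slot] = v // keep the smallest derefs seen for this slot
	}
}

// leaksToHeap reports whether any heap flow was recorded.
func (l *leakSet) leaksToHeap() bool { return l[0] != 0 }

func main() {
	var i7, i8 leakSet
	i7.record(6, 0) // seventh result: still has its own slot
	i8.record(7, 0) // eighth result: falls back to the heap slot
	fmt.Println(i7.leaksToHeap(), i8.leaksToHeap()) // prints "false true"
}
```

In this model the eighth and ninth parameters look exactly like parameters stored in a global, which matches the plain "leaking param: i8" and "moved to heap: i8" lines in the question's output.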