GoLang Pointer Performance
Question
The following code shows two benchmarks. The first creates a struct by value in each iteration, while the second uses a pointer to the struct.
Why is the latter 20x slower? I know about GC issues with Go, but shouldn't escape analysis handle these situations?
I'm using go1.4beta1. Go 1.3.3 gave me different numbers, but the pointer version was still much slower.
Any ideas?
package main

import "testing"

type Adder struct {
	vals []int
}

func (a *Adder) add() int {
	return a.vals[0] + a.vals[1]
}

func BenchmarkWithoutPointer(b *testing.B) {
	accum := 0
	for i := 0; i < b.N; i++ {
		adder := Adder{[]int{accum, i}}
		accum = adder.add()
	}
	_ = accum
}

func BenchmarkWithPointer(b *testing.B) {
	accum := 0
	for i := 0; i < b.N; i++ {
		adder := &Adder{[]int{accum, i}}
		accum = adder.add()
	}
	_ = accum
}
Benchmark go1.4.1:
$ go test -bench=.
testing: warning: no tests to run
PASS
BenchmarkWithoutPointer 1000000000 2.92 ns/op
BenchmarkWithPointer 30000000 57.8 ns/op
ok github.com/XXXXXXXXXX/bench/perf 5.010s
Benchmark go1.3.3:
testing: warning: no tests to run
PASS
BenchmarkWithoutPointer 500000000 7.89 ns/op
BenchmarkWithPointer 50000000 37.5 ns/op
ok
EDIT:
Conclusion:
As Ainar-G said, the []int escapes to the heap in the second benchmark. After reading a bit more about 1.4beta1, it seems that new write barriers were introduced for heap accesses as part of the new GC plans. Raw execution speed, however, seems to have improved (the non-pointer benchmark went from 7.89 ns/op to 2.92 ns/op). Looking forward to 1.5 =).
Answer 1
Score: 12
Running the benchmark with the -m gcflag (go test -gcflags=-m -bench=.) gives a possible answer:
./main_test.go:16: BenchmarkWithoutPointer []int literal does not escape
(...)
./main_test.go:25: []int literal escapes to heap
Your []int in the second example escapes to the heap, and heap allocation is slower than stack allocation. If you use separate x and y fields for your arguments instead of a slice:
type Adder struct {
	x, y int
}

func (a *Adder) add() int {
	return a.x + a.y
}
the benchmark shows the expected behaviour:
BenchmarkWithoutPointer 1000000000 2.27 ns/op
BenchmarkWithPointer 2000000000 1.98 ns/op
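Besides reading the -m output, the allocation difference can be observed directly. The following is a minimal sketch, not part of the original answer: it uses testing.AllocsPerRun from the standard library, and the names sliceAdder, fieldAdder, sink, allocsWithSlice, and allocsWithFields are made up for illustration. The numbers it prints depend on the Go version and on what the compiler decides to keep on the stack, so no expected output is given.

```go
package main

import (
	"fmt"
	"testing"
)

// sliceAdder mirrors the question's Adder: the operands live in a slice.
type sliceAdder struct {
	vals []int
}

func (a *sliceAdder) add() int { return a.vals[0] + a.vals[1] }

// fieldAdder mirrors the answer's variant: the operands are plain fields.
type fieldAdder struct {
	x, y int
}

func (a *fieldAdder) add() int { return a.x + a.y }

// sink keeps the results live so the calls are not optimized away.
// (Hypothetical helper, not in the original post.)
var sink int

// allocsWithSlice reports the average number of heap allocations per call
// when the slice-based struct is created through a pointer.
func allocsWithSlice() float64 {
	return testing.AllocsPerRun(1000, func() {
		adder := &sliceAdder{[]int{sink, 1}}
		sink = adder.add()
	})
}

// allocsWithFields does the same for the field-based struct.
func allocsWithFields() float64 {
	return testing.AllocsPerRun(1000, func() {
		adder := &fieldAdder{sink, 1}
		sink = adder.add()
	})
}

func main() {
	fmt.Println("slice-based, via pointer:", allocsWithSlice())
	fmt.Println("field-based, via pointer:", allocsWithFields())
}
```

A non-zero count for the slice-based variant would be consistent with the "escapes to heap" line above, while a zero count means the compiler in use managed to keep everything on the stack.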
Answer 2
Score: 2
Running the exact code from the original post with go1.16.7, the pointer version is now about the same speed (slightly faster):
$ go test -bench=.
goos: linux
goarch: amd64
pkg: example.com/x
cpu: Intel(R) Core(TM) i7-9850H CPU @ 2.60GHz
BenchmarkWithoutPointer-12 945450447 1.212 ns/op
BenchmarkWithPointer-12 965921562 1.199 ns/op
PASS
ok example.com/x 2.562s
So the compiler has gotten smarter since the OP posed the question.
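The equal timings are presumably possible because, in the question's loop, the pointer never outlives a single iteration. As a contrast, here is a hedged sketch of a case where the value has to live on the heap regardless of compiler version: the pointer is stored in a package-level variable. The names kept, localOnly, and keptAlive are hypothetical and not from the original post.

```go
package main

import "fmt"

// Adder is the struct from the question.
type Adder struct {
	vals []int
}

func (a *Adder) add() int { return a.vals[0] + a.vals[1] }

// kept is a package-level variable used only to force an escape.
var kept *Adder

// localOnly never lets the pointer leave the function, so the compiler
// is free to place the struct and its slice on the stack.
func localOnly(x, y int) int {
	adder := &Adder{[]int{x, y}}
	return adder.add()
}

// keptAlive stores the pointer in a global, so the struct and its slice
// must outlive the call and therefore be heap allocated.
func keptAlive(x, y int) int {
	adder := &Adder{[]int{x, y}}
	kept = adder
	return adder.add()
}

func main() {
	fmt.Println(localOnly(1, 2), keptAlive(3, 4)) // prints "3 7"
}
```

Building this with go build -gcflags=-m should report the literals in keptAlive as escaping to the heap. What it reports for localOnly depends on the compiler version.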