Golang profiling - top10 shows only one line with 100%
Question
I am trying to profile my Go library to find out why it is so much slower than the equivalent C++ code.
I have a simple benchmark:
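func BenchmarkFile(b *testing.B) {
	// Create a temporary file for the writer under test.
	tmpFile, err := ioutil.TempFile("", TMP_FILE_PREFIX)
	if err != nil {
		b.Fatal(err)
	}
	fw, err := NewFile(tmpFile.Name())
	if err != nil {
		b.Fatal(err)
	}
	text := []byte("testing")
	for i := 0; i < b.N; i++ {
		if _, err = fw.Write(text); err != nil {
			b.Fatal(err)
		}
	}
	fw.Close()
}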
NewFile returns my custom Writer, which encodes data into our binary representation, can even compress it, and writes it to the file system.
Running go test -bench . -memprofile mem.out -cpuprofile cpu.out
I get:
PASS
BenchmarkFile-16 2000000000 0.20 ns/op
ok .../writer/iowriter 9.074s
Then I analyse it:
# go tool pprof cpu.out
Entering interactive mode (type "help" for commands)
(pprof) top10
930ms of 930ms total ( 100%)
flat flat% sum% cum cum%
930ms 100% 100% 930ms 100%
(pprof)
I even tried writing an example.go app that uses my writer and calls pprof.StartCPUProfile(f), as shown in http://blog.golang.org/profiling-go-programs, but with the same result.
What am I doing wrong, and how can I determine the bottleneck in my library?
Thank you in advance.
Answer 1
Score: 8
OK, it's easy: I forgot to pass the binary to go tool pprof, so it has to be:
# go tool pprof write cpu.out
Entering interactive mode (type "help" for commands)
(pprof) top10
7.02s of 7.38s total (95.12%)
Dropped 14 nodes (cum <= 0.04s)
Showing top 10 nodes out of 32 (cum >= 0.19s)
flat flat% sum% cum cum%
6.55s 88.75% 88.75% 6.76s 91.60% syscall.Syscall
...
When profiling via benchmark tests, go test also leaves a test binary in the working directory, and passing that binary to pprof gives the same result.
Answer 2
Score: 1
To expand on sejvolnd's answer:
pprof needs the binary that actually generated the cpu.out file as its first argument.
So you need to run the command as go tool pprof <go binary of your program> <generated profiling output file>,
e.g. go tool pprof go_binary cpu.pprof