How to profile benchmarks using the pprof tool?

Question

I want to profile my benchmarks, which are built into a test binary by go test -c, but go tool pprof needs a profile file, which is usually generated inside the main function like this:

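// cpuprofile is the command-line flag that names the profile output file.
var cpuprofile = flag.String("cpuprofile", "", "write cpu profile to file")

func main() {
    flag.Parse()
    if *cpuprofile != "" {
        f, err := os.Create(*cpuprofile)
        if err != nil {
            log.Fatal(err)
        }
        // Profile until main returns.
        pprof.StartCPUProfile(f)
        defer pprof.StopCPUProfile()
    }
}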

How can I create a profile file within my benchmarks?

Answer 1

Score: 19

As described in https://pkg.go.dev/cmd/go#hdr-Testing_flags, you can specify the profile file with the -cpuprofile flag.

For example:

go test -cpuprofile cpu.out

Answer 2

Score: 3

Pass the -cpuprofile flag to go test, as documented at http://golang.org/cmd/go/#hdr-Description_of_testing_flags

Answer 3

Score: 1

This post explains how to profile benchmarks with an example: Benchmark Profiling with pprof.

The following benchmark simulates some CPU work.

package main

import (
    "math/rand"
    "testing"
)

func BenchmarkRand(b *testing.B) {
    for n := 0; n < b.N; n++ {
        rand.Int63()
    }
}

To generate a CPU profile for the benchmark test, run:

go test -bench=BenchmarkRand -benchmem -cpuprofile profile.out

The -memprofile and -blockprofile flags can be used to generate memory allocation and blocking call profiles.

To analyze the profile, use go tool pprof:

go tool pprof profile.out
(pprof) top
Showing nodes accounting for 1.16s, 100% of 1.16s total
Showing top 10 nodes out of 22
      flat  flat%   sum%        cum   cum%
     0.41s 35.34% 35.34%      0.41s 35.34%  sync.(*Mutex).Unlock
     0.37s 31.90% 67.24%      0.37s 31.90%  sync.(*Mutex).Lock
     0.12s 10.34% 77.59%      1.03s 88.79%  math/rand.(*lockedSource).Int63
     0.08s  6.90% 84.48%      0.08s  6.90%  math/rand.(*rngSource).Uint64 (inline)
     0.06s  5.17% 89.66%      1.11s 95.69%  math/rand.Int63
     0.05s  4.31% 93.97%      0.13s 11.21%  math/rand.(*rngSource).Int63
     0.04s  3.45% 97.41%      1.15s 99.14%  benchtest.BenchmarkRand
     0.02s  1.72% 99.14%      1.05s 90.52%  math/rand.(*Rand).Int63
     0.01s  0.86%   100%      0.01s  0.86%  runtime.futex
         0     0%   100%      0.01s  0.86%  runtime.allocm

The bottleneck in this case is the mutex, caused by the default source in math/rand being synchronized.

Other profile presentations and output formats are also possible, e.g. tree. Type help for more options.

Note that any initialization code before the benchmark loop will also be profiled.

huangapple
  • Posted on April 14, 2014 at 05:05:53
  • When reposting, please keep the link to this article: https://go.coder-hub.com/23048455.html