混乱(和错误?)的Go分析器输出

huangapple go评论74阅读模式
英文:

Confusing (and incorrect?) output from Go profiler

问题

我有一个Go二进制文件,我正在尝试对其进行性能分析,并且我得到了令人惊讶的结果。代码在main.go中有以下(截断的)部分,其余的代码在monte包中:

package main

import (
  "monte"
  "runtime/pprof"
)

var cpuprofile = flag.String("cpuprofile", "", "write cpu profile to file")

func main() {
  flag.Parse()
  if *cpuprofile != "" {
    f, err := os.Create(*cpuprofile)
    if err != nil {
      log.Fatal(err)
    }
    pprof.StartCPUProfile(f)
  }

  monte.ExpensiveOperation()

  pprof.StopCPUProfile()
}

我使用go build src/main.go构建可执行文件,然后使用./main -cpuprofile=monte.prof运行它。当我使用go tool pprof main monte.prof检查输出时,我得到以下输出:

(pprof) top10 --cum
Total: 346 samples
     280  80.9%  80.9%      343  99.1% time.Time.Format
       0   0.0%  80.9%      341  98.6% runtime.interhash
       0   0.0%  80.9%      341  98.6% runtime.main
       0   0.0%  80.9%      251  72.5% strconv.Unquote
      13   3.8%  84.7%       31   9.0% strconv.roundShortest
      11   3.2%  87.9%       18   5.2% strconv.fmtE
       9   2.6%  90.5%        9   2.6% runtime.markallocated
       1   0.3%  90.8%        8   2.3% math/rand.Float64
       2   0.6%  91.3%        8   2.3% runtime.FixAlloc_Free
       7   2.0%  93.4%        8   2.3% time.nextStdChunk

具有最大累积时间的函数是time.Time.Format,这对我来说似乎是错误的(应该是main吗?),而且根本没有提到monte,尽管“昂贵的操作”需要大约10秒才能完成,足够的时间供采样器观察到它。如果我向go tool pprof传递--focus=monte标志,根本不会显示任何样本。我想我可能在某个地方漏掉了一些标志;有人有任何想法吗?谢谢!

英文:

I have a Go binary that I'm trying to profile, and I'm getting surprising results. The code has the following (truncated) in main.go, and the rest of the code is in the package monte:

package main

import (
  "monte"
  "runtime/pprof"
)

var cpuprofile = flag.String("cpuprofile", "", "write cpu profile to file")

func main() {
  flag.Parse()
  if *cpuprofile != "" {
    f, err := os.Create(*cpuprofile)
    if err != nil {
      log.Fatal(err)
    }
    pprof.StartCPUProfile(f)
  }

  monte.ExpensiveOperation()

  pprof.StopCPUProfile()
}

I build my executable with go build src/main.go, then run it with ./main -cpuprofile=monte.prof. When I examine the output with go tool pprof main monte.prof, I get the following output:

(pprof) top10 --cum
Total: 346 samples
     280  80.9%  80.9%      343  99.1% time.Time.Format
       0   0.0%  80.9%      341  98.6% runtime.interhash
       0   0.0%  80.9%      341  98.6% runtime.main
       0   0.0%  80.9%      251  72.5% strconv.Unquote
      13   3.8%  84.7%       31   9.0% strconv.roundShortest
      11   3.2%  87.9%       18   5.2% strconv.fmtE
       9   2.6%  90.5%        9   2.6% runtime.markallocated
       1   0.3%  90.8%        8   2.3% math/rand.Float64
       2   0.6%  91.3%        8   2.3% runtime.FixAlloc_Free
       7   2.0%  93.4%        8   2.3% time.nextStdChunk

The function that has the largest cumulative time is time.Time.Format, which seems wrong to me (shouldn't it be main?) and there's no mention of monte at all, despite the fact that the "expensive operation" takes about 10 seconds to complete, plenty of time for the sampler to see it. If I pass the --focus=monte flag to go tool pprof, no samples are shown at all. I assume I'm missing some flag somewhere; does anyone have any ideas? Thanks!

答案1

得分: 2

你的main包无法编译,并且你没有提供monte包的源代码。因此,我们无法重现你的结果,这是调试的第一步。

这里有一些可工作的源代码和结果。

package monte

func ExpensiveOperation() {
    var t float64
    for i := int64(0); i < 1e10; i++ {
        t += 1
    }
}
package main

import (
    "flag"
    "log"
    "monte"
    "os"
    "runtime/pprof"
)

var cpuprofile = flag.String("cpuprofile", "", "write cpu profile to file")

func main() {
    flag.Parse()
    if *cpuprofile != "" {
        f, err := os.Create(*cpuprofile)
        if err != nil {
            log.Fatal(err)
        }
        pprof.StartCPUProfile(f)
    }

    monte.ExpensiveOperation()

    pprof.StopCPUProfile()
}

(pprof) top10 --cum
Total: 1166 samples
0 0.0% 0.0% 1166 100.0% main.main
1166 100.0% 100.0% 1166 100.0% monte.ExpensiveOperation
0 0.0% 100.0% 1166 100.0% runtime.main
0 0.0% 100.0% 1166 100.0% schedunlock

更新:

github.com/haldean/monte上的代码没有提供有意义的结果。它只花费了0m0.506s的时间,并且只有48个样本。

(pprof) top10 --cum
Total: 48 samples
0 0.0% 0.0% 45 93.8% main.main
0 0.0% 0.0% 45 93.8% monte.(*Scene).Evaluate
0 0.0% 0.0% 45 93.8% monte.(*Scene).Render
0 0.0% 0.0% 45 93.8% monte.(*Scene).SetColor
0 0.0% 0.0% 45 93.8% runtime.main
0 0.0% 0.0% 45 93.8% schedunlock
0 0.0% 0.0% 44 91.7% monte.(*Scene).RayCast
4 8.3% 8.3% 31 64.6% runtime.new
13 27.1% 35.4% 27 56.2% runtime.mallocgc
3 6.2% 41.7% 26 54.2% monte.(*Scene).DirectionAt

英文:

Your main package doesn't compile and you don't provide source code for your monte package. Therefore, we can't reproduce your results. which is the first step in debugging.

Here's some working source code and results.

package monte

func ExpensiveOperation() {
	var t float64
	for i := int64(0); i &lt; 1e10; i++ {
		t += 1
	}
}

.

package main

import (
	&quot;flag&quot;
	&quot;log&quot;
	&quot;monte&quot;
	&quot;os&quot;
	&quot;runtime/pprof&quot;
)

var cpuprofile = flag.String(&quot;cpuprofile&quot;, &quot;&quot;, &quot;write cpu profile to file&quot;)

func main() {
	flag.Parse()
	if *cpuprofile != &quot;&quot; {
		f, err := os.Create(*cpuprofile)
		if err != nil {
			log.Fatal(err)
		}
		pprof.StartCPUProfile(f)
	}

	monte.ExpensiveOperation()

	pprof.StopCPUProfile()
}

.

(pprof) top10 --cum
Total: 1166 samples
       0   0.0%   0.0%     1166 100.0% main.main
    1166 100.0% 100.0%     1166 100.0% monte.ExpensiveOperation
       0   0.0% 100.0%     1166 100.0% runtime.main
       0   0.0% 100.0%     1166 100.0% schedunlock

UPDATE:

The code at github.com/haldean/monte doesn't provide meaningful results. It only takes time real 0m0.506s and only takes 48 samples.

(pprof) top10 --cum
Total: 48 samples
       0   0.0%   0.0%       45  93.8% main.main
       0   0.0%   0.0%       45  93.8% monte.(*Scene).Evaluate
       0   0.0%   0.0%       45  93.8% monte.(*Scene).Render
       0   0.0%   0.0%       45  93.8% monte.(*Scene).SetColor
       0   0.0%   0.0%       45  93.8% runtime.main
       0   0.0%   0.0%       45  93.8% schedunlock
       0   0.0%   0.0%       44  91.7% monte.(*Scene).RayCast
       4   8.3%   8.3%       31  64.6% runtime.new
      13  27.1%  35.4%       27  56.2% runtime.mallocgc
       3   6.2%  41.7%       26  54.2% monte.(*Scene).DirectionAt

答案2

得分: 2

看起来像是一个仅支持CPU的分析器,所以如果你的ExpensiveOperation通过执行I/O操作、休眠或类似的操作来消耗时间,它将是不可见的。(这就是“cpu”分析器的问题。)

就数字的含义而言,看起来总共有346个样本。
这些数字有点模糊并不太令人惊讶,这取决于分析器的工作方式,但如果它是一个真正的堆栈采样器,那么这些数字的含义将是:

341/346个样本在堆栈上有maininterhash。你期望所有样本在堆栈上都有main,但这是模糊的部分。

343/346个样本在堆栈上有Format。(为什么有更多的Format而不是main,谁知道呢?)

251/346个样本在堆栈上有Unquote。所以在这251个样本中,它们可能还有maininterhashFormat在堆栈上。

通过这种侦探工作,你可以慢慢开始拼凑出样本所告诉你的信息。

当然,如果你能够实际看到堆栈样本,你在看到很少的样本之前就会知道确切发生了什么。

更多相关信息

英文:

It looks like a CPU-only profiler, so if your ExpensiveOperation takes time by doing I/O or sleep or something like that, it will be invisible.
(That's the problem with "cpu" profilers.)

In terms of what the numbers mean, it looks like there were 346 total samples.
It's not too surprising that the numbers are a bit squishy, depending on how the profiler works, but if it were a true stack sampler the numbers would mean this:

341/346 samples had main and interhash on the stack. You would expect all samples to have main on the stack, but that's the squishy part.

343/346 samples had Format on the stack. (Why there's more of those than main, who knows?)

251/346 samples had Unquote on the stack. So of those 251 samples, they also probably had main, interhash, and Format on the stack.

With this kind of detective work, you can slowly begin to piece together what the samples are telling you.

Of course, if you could actually see the stack samples, you wouldn't have to see very many of them before you would know exactly what's going on.

More about all that.

huangapple
  • 本文由 发表于 2012年10月8日 11:15:19
  • 转载请务必保留本文链接:https://go.coder-hub.com/12774861.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定