How to visualize profile files graphically?
Question
I'm developing with Go 1.2 on Windows 8.1 64-bit. I had many issues getting the go pprof tool to work properly, such as memory addresses being displayed instead of actual function names.
However, I found profile, which seems to do a great job of producing profile files that work with the pprof tool. My question is: how do I use those profile files for graphical visualization?
Answer 1
Score: 2
You can try go tool pprof /path/to/program profile.prof
to fix the problem of function names not being shown correctly.
If you want graphical visualization, enter the web command
in pprof.
Answer 2
Score: 1
If your goal is to see pretty but basically meaningless pictures, go for visualization as @Specode suggested.
If your goal is speed, then I recommend you forget visualization.
Visualization does not tell you what you need to fix.
This method does tell you what to fix.
You can do it quite effectively in GDB.
EDIT in response to @BurntSushi5:
Here are my "gripes with graphs"
In the first place, they are super easy to fool.
For example, suppose A1 spends all its time calling C2, and vice-versa.
Then suppose a new routine B is inserted, such that when A1 calls B, B calls C2, and when A2 calls B, B calls C1.
The graph loses the information that every time C2 is called, A1 is above it on the stack, and vice-versa.
For another example, suppose every call to C is from A.
Then suppose instead A "dispatches" to a bunch of functions B1, B2, ..., each of which calls C.
The graph loses the information that every call to C comes through A.
Now to the graph that was linked:
- It places great emphasis on self time, making giant boxes, when inclusive time is far more important. (In fact, the whole reason gprof was invented was because self time was about as useful as a clock with only a second-hand.) They could at least have scaled the boxes by inclusive time.
- It says nothing about the lines of code that the calls come from, or that are spending the self time. It's based on the assumption that all functions should be small. Maybe that's true, and maybe not, but is it a good enough reason for the profile output to be unhelpful?
- It is chock-full of little boxes that don't matter because their time is insignificant. All they do is take up gobs of real estate and distract you.
- There's nothing in there about I/O. The profiler from which the graph came apparently embodies that the only I/O is necessary I/O, so there's no need to profile it (even if it takes 90% of the time). In big programs, it's really easy for I/O to be done that isn't really necessary, taking a big fraction of time, and so-called "CPU profilers" have the prejudice that it doesn't even exist.
- There doesn't seem to be any instance of recursion in that graph, but recursion is common, and useful, and graphs have difficulty displaying it with meaningful measurements.
Just pointing out that, if a small number of stack samples are taken, roughly half of them would look like this:
blah-blah-couldn't-read-it
blah-blah-couldn't-read-it
blah-blah-couldn't-read-it
fragbag.(*structureAtoms).BestStructureFragment
structure.RMSDMem
... a couple other routines
The other half of the samples are doing something else, equally informative.
Since each stack sample shows you the lines of code where the calls come from, you're actually being told why the time is being spent.
(Activities that don't take much time have very small likelihood of being sampled, which is good, because you don't care about those.)
Now I don't know this code, but the graph gives me a strong suspicion that, like a lot of code I see, the devil's in the data structure.