英文:
async profiler displays wrong stacktraces
问题
我正在尝试使用 async profiler 版本为 1.8.1 来对我的应用进行性能分析。
JVM 版本为 openjdk "15-ea" 2020-09-15,但是使用 openjdk 14.0.1 也出现了相同的情况。
async profiler 被运行时使用了以下标志:-d 60 -t -i 10000 -o svg
结果显示非常奇怪。 我们可以看到硬件线程花费大部分时间不在 java.lang.Thread::run 方法中,而是在一些奇怪的地方。如何解释这种情况?我能想到的一个可能的解释是 async profiler 未能正确遍历堆栈跟踪,并将部分堆栈跟踪错误地放在了其他位置。还有其他的解释吗?如何修复这个问题?
英文:
I am trying to use async profiler v.1.8.1 for profiling my application.
JVM is openjdk version "15-ea" 2020-09-15, but same happened with openjdk 14.0.1 as well.
async profiler is run with flags: -d 60 -t -i 10000 -o svg
It shows extremely strange results. We can see that hardware thread spends most of the time not in java.lang.Thread::run, but in some weird place. How can it be explained? I see a possible explanation that async profiler fails to properly traverse stack traces and puts part of these stack traces in a wrong place. Is there another explanation? How can it be fixed?
答案1
得分: 3
我假设你想知道在配置文件中的 clock_gettime
下面为什么没有 Java 堆栈。
如你所见,堆栈以 [unknown_Java]
帧结束。这意味着线程确实在运行某些 Java 代码,但是异步分析器无法获取到 Java 堆栈跟踪,因为 JVM 在寻找顶层 Java 帧时出现了问题。
这是因为 System.nanoTime()
和 System.currentTimeMillis()
是 JVM 内置函数。它们会被即时编译为直接调用对应的 C 函数,而不会在调用时从 in_java
切换到 in_native
状态。这意味着 JVM 在调用 nanoTime
或 currentTimeMillis
时不会保存指向最后一个 Java 帧的指针,因此在异步堆栈遍历过程中无法正确识别最后一个 Java 帧。
不幸的是,异步分析器对此无能为力。一个可能的解决方法是禁用相应的 JVM 内置函数:
java -XX:+UnlockDiagnosticVMOptions -XX:DisableIntrinsic=_currentTimeMillis,_nanoTime
另外,我在你的火焰图中发现奇怪的一点是 clock_gettime
调用了内核。通常情况下不应该如此,因为 clock_gettime
在 vDSO 中实现,它映射到进程的用户空间。原因可能是错误的时钟源或禁用了 vDSO(1,2)。
英文:
I assume you wonder why there are no Java frames underneath clock_gettime
in the profile.
As you can see, the stack ends with [unknown_Java]
frame. This means, the thread indeed runs some Java code, but async-profiler cannot get Java stack trace, because the JVM fails to find the top Java frame.
This happens because System.nanoTime()
and System.currentTimeMillis()
are JVM intrinsics. They are JIT-compiled as a direct call of a corresponding C function without switching thread from in_java
to in_native
state. This means, the JVM does not save a pointer to the last Java frame when calling nanoTime
or currentTimeMillis
, and thus has problems discovering the last Java frame during asynchronous stack walking.
Unfortunately, async-profiler cannot do much about it. One possible workaround is to disable the corresponding JVM intrinsics:
java -XX:+UnlockDiagnosticVMOptions -XX:DisableIntrinsic=_currentTimeMillis,_nanoTime
BTW, what I find strange in your flame graph is that clock_gettime
calls into the kernel. Typically it should not, since clock_gettime
is implemented in vDSO which is mapped into the process' user space. The reason could be a wrong clock source / disabled vDSO (1, 2).
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论