When a busy-spinning Java thread is bound to a physical core, can a context switch happen because a new branch in the code is reached?
Question
I am interested in low-latency code and that's why I tried to configure thread affinity. In particular, it was supposed to help to avoid context switches.
I have configured thread affinity using https://github.com/OpenHFT/Java-Thread-Affinity. I run a very simple piece of test code that just spins in a loop checking a time condition.
long now = start;
while (true)
{
    if (now < start + TimeUtils.NANOS_IN_SECOND * delay)
    {
        now = TimeUtils.now();
    }
    else
    {
        // Will be printed after 30 sec
        if (TimeUtils.now() > start + TimeUtils.NANOS_IN_SECOND * (delay + 30))
        {
            final long finalNow = now;
            System.out.println("Time is over at " +
                    TimeUtils.toInstant(finalNow) + " now: " +
                    TimeUtils.toInstant(TimeUtils.now()));
            System.exit(0);
        }
    }
}
So, after the specified delay, execution goes to the "else" branch, and at approximately the same time I see a context switch. Is this expected behavior? What is the specific reason for it? How can I avoid context switches in such cases?
Test Details
I build a shadowJar from this repo: https://github.com/stepan2271/thread-affinity-example. Then I run it using the following command (one can play around with the numbers here; they do not have a significant effect on the test when delay > 60):
taskset -c 19 java -DtestLoopBindingCpu=3 -Ddelay=74 -cp demo-all.jar main.TestLoop
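The binding inside the JVM is done with Java-Thread-Affinity. For reference, below is a minimal sketch of that approach, assuming the library's AffinityLock.acquireLock(int) and release() methods; the class name, helper method and default CPU id are illustrative and not taken from the linked repository.

import net.openhft.affinity.AffinityLock;

// Sketch only: pin the current thread to the CPU given by -DtestLoopBindingCpu.
// Class name and loop stub are illustrative; the real setup lives in the linked repo.
public class PinnedSpinner
{
    public static void main(String[] args)
    {
        int cpuId = Integer.getInteger("testLoopBindingCpu", 3);
        AffinityLock lock = AffinityLock.acquireLock(cpuId); // bind the current thread to that CPU
        try
        {
            runBusySpinLoop(); // the busy-spin loop shown above
        }
        finally
        {
            lock.release();
        }
    }

    private static void runBusySpinLoop()
    {
        // ... the time-checking loop from the question ...
    }
}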
I also have the following test script to monitor context switches (it should be run with the ID of the Java thread that is bound to the core):
#!/bin/bash
while [ true ]
do
date >> ~/demo-ctxt-switches.log
cat /proc/$1/status | grep ctxt >> ~/demo-ctxt-switches.log
sleep 3
done
Typical output of this script is the following:
Fri Oct 16 18:23:29 MSK 2020
voluntary_ctxt_switches: 90
nonvoluntary_ctxt_switches: 37
Fri Oct 16 18:23:32 MSK 2020
voluntary_ctxt_switches: 90
nonvoluntary_ctxt_switches: 37
Fri Oct 16 18:23:35 MSK 2020
voluntary_ctxt_switches: 90
nonvoluntary_ctxt_switches: 37
Fri Oct 16 18:23:38 MSK 2020
voluntary_ctxt_switches: 90
nonvoluntary_ctxt_switches: 37
Fri Oct 16 18:23:41 MSK 2020
voluntary_ctxt_switches: 91
nonvoluntary_ctxt_switches: 37
Fri Oct 16 18:23:44 MSK 2020
voluntary_ctxt_switches: 91
nonvoluntary_ctxt_switches: 37
Fri Oct 16 18:23:47 MSK 2020
voluntary_ctxt_switches: 91
nonvoluntary_ctxt_switches: 37
So, after some fluctuation at start time these numbers become stable, and then I see from 1 to 3 switches exactly at the time (the difference is less than 1 sec) when the code reaches the "else" branch.
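For completeness, the same counters can also be polled from Java instead of the shell script. A rough sketch, assuming a Linux /proc layout; the class name is illustrative and the 3-second cadence mirrors the script above.

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Sketch only: poll the voluntary/nonvoluntary context-switch counters of a Linux task.
// Pass the native thread id (the same id the shell script expects) as the first argument.
public class CtxtSwitchWatcher
{
    public static void main(String[] args) throws IOException, InterruptedException
    {
        Path status = Path.of("/proc/" + args[0] + "/status");
        while (true)
        {
            List<String> lines = Files.readAllLines(status);
            for (String line : lines)
            {
                if (line.startsWith("voluntary_ctxt_switches") || line.startsWith("nonvoluntary_ctxt_switches"))
                {
                    System.out.println(java.time.Instant.now() + " " + line);
                }
            }
            Thread.sleep(3_000); // same 3-second cadence as the script
        }
    }
}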
Deviations
The base configuration reproduces this behavior almost every time, while some deviations lead to a situation where I was not able to reproduce it. Examples:
https://github.com/stepan2271/thread-affinity-example/tree/without-log4j
https://github.com/stepan2271/thread-affinity-example/tree/without-cached-nano-clock
Test environment
2 * Intel(R) Xeon(R) Gold 6244 CPU @ 3.60GHz
Red Hat Enterprise Linux 8.1 (Ootpa)
Cores are isolated using CPUAffinity in /etc/systemd/system.conf and /etc/systemd/user.conf
/etc/sysconfig/irqbalance is configured.
Openjdk 11.0.6 2020-01-14 LTS Runtime Environment 18.9
Answer 1
Score: 28
A voluntary context switch usually means a thread is waiting for something, e.g. for a lock to become free.
async-profiler can help to find where context switches happen. Here is a command line I used:
./profiler.sh -d 80 -e context-switches -i 2 -t -f switches.svg -I 'main*' -X 'exit_to_usermode_loop*' PID
Let's go through it in detail:
- -d 80: run profiler for at most 80 seconds.
- -e context-switches: an event to profile.
- -i 2: interval = 2 events. I profile every second context switch, since the profiling signal itself causes a context switch, and I don't want to fall into recursion.
- -t: split the profile by threads.
- -f switches.svg: output file name; the svg extension automatically selects Flame Graph format.
- -I 'main*': include only the main thread in the output.
- -X 'exit_to_usermode_loop*': exclude events related to nonvoluntary context switches.
- PID: Java process ID to profile.
The results may differ from one run to another. Typically I see from 0 to 3 context switches on each graph.
Here are the most common places where a context switch happens. They are indeed related to waiting on a mutex.
- ThreadSafepointState::handle_polling_page_exception() called from TestLoop.main. This means a thread has been stopped at a safepoint requested by another thread. To investigate the reason for the safepoint, add the -Xlog:safepoint* JVM option.
[75.889s][info][safepoint ] Application time: 74.0071000 seconds
[75.889s][info][safepoint ] Entering safepoint region: Cleanup
[75.889s][info][safepoint,cleanup] deflating idle monitors, 0.0000003 secs
[75.889s][info][safepoint,cleanup] updating inline caches, 0.0000058 secs
[75.890s][info][safepoint,cleanup] compilation policy safepoint handler, 0.0000004 secs
[75.890s][info][safepoint,cleanup] purging class loader data graph, 0.0000001 secs
[75.890s][info][safepoint,cleanup] resizing system dictionaries, 0.0000009 secs
[75.890s][info][safepoint,cleanup] safepoint cleanup tasks, 0.0001440 secs
[75.890s][info][safepoint ] Leaving safepoint region
Right, a Cleanup safepoint happens shortly after 74 seconds (exactly the specified delay). The purpose of a Cleanup safepoint is to run periodic tasks; in this case - to update inline caches. If there is cleanup work to do, a safepoint may happen every GuaranteedSafepointInterval milliseconds (1000 by default). You can disable periodic safepoints by setting -XX:GuaranteedSafepointInterval=0, but this may have performance implications.
- SharedRuntime::handle_wrong_method() from TimeUtils.now. This happens when a call site in the compiled code has been made non-entrant. As this is related to JIT compilation, add the -XX:+PrintCompilation option.
75032 1430 % 4 main.TestLoop::main @ 149 (245 bytes) made not entrant
75033 1433 % 3 main.TestLoop::main @ 149 (245 bytes)
75033 1434 4 util.RealtimeNanoClock::nanoTime (8 bytes)
75034 1431 3 util.RealtimeNanoClock::nanoTime (8 bytes) made not entrant
75039 1435 % 4 main.TestLoop::main @ 149 (245 bytes)
75043 1433 % 3 main.TestLoop::main @ 149 (245 bytes) made not entrant
Yes, both TestLoop.main and RealtimeNanoClock.nanoTime were recompiled 75 seconds after JVM start. To find out the reason, add -XX:+UnlockDiagnosticVMOptions -XX:+LogCompilation.
This will produce a large compilation log, where we'll look for an event that happened at the 75th second.
<uncommon_trap thread='173414' reason='unstable_if' action='reinterpret' debug_id='0' compile_id='1232' compile_kind='osr' compiler='c2' level='4' stamp='75.676'>
<jvms bci='161' method='main.TestLoop main ([Ljava/lang/String;)V' bytes='245' count='1' backedge_count='533402' iicount='1'/>
That was an uncommon trap due to unstable_if at bytecode index 161. In other words, when main was JIT-compiled, HotSpot did not produce code for the else branch, because it had never been executed before (a kind of speculative dead code elimination). However, to retain correctness of the compiled code, HotSpot places a trap to deoptimize and fall back to the interpreter if the speculative condition fails. This is exactly what happens in your case when the if condition becomes false.
- Runtime1::counter_overflow(). This is again related to recompilation. After running C1 compiled code for some time, HotSpot discovers that the code is hot, and decides to recompile it with C2. In this case I caught a contended lock on the compiler queue.
Conclusion
HotSpot JIT compilers heavily rely on speculative optimizations. When a speculative condition fails, this leads to deoptimization. Deoptimization is indeed very bad for low latency applications: besides switching to slow execution in the interpreter, this may indirectly cause undesired pauses due to acquiring locks in the JVM runtime, or bringing the JVM to a safepoint.
Common reasons for deoptimization are unstable_if and class_check. If you want to avoid deoptimization on a latency-critical path, make sure to "warm up" all code paths and all possible receivers of virtual methods.
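A sketch of what such a warm-up could look like for the unstable_if case above; this is not from the answer, and the method names, threshold and iteration count are hypothetical.

// Sketch only: exercise both sides of the branch during warm-up so the JIT compiles
// both paths and the unstable_if uncommon trap is not hit later on the hot path.
// Method names, the threshold and the iteration count are hypothetical.
public class WarmUpExample
{
    static long process(long now, long deadline)
    {
        if (now < deadline)
        {
            return now;               // fast path, taken while waiting
        }
        else
        {
            return handleExpiry(now); // slow path that must also be seen before the critical phase
        }
    }

    static long handleExpiry(long now)
    {
        return now;
    }

    public static void main(String[] args)
    {
        // Warm-up: feed values on both sides of the deadline so neither branch
        // is speculatively pruned by the compiler.
        long sink = 0;
        for (int i = 0; i < 100_000; i++)
        {
            sink += process(i, 50_000);
        }
        System.out.println("warm-up done: " + sink);
        // ... the latency-critical busy-spin phase would start here ...
    }
}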