Why is `std::this_thread::yield()` 10x slower than `std::this_thread::sleep_for(0s)`?
Question
Just testing the two small programs,
<!-- language: c++ -->

    #include <thread>

    int main()
    {
        for (int i = 0; i < 10000000; i++)
        {
            std::this_thread::yield();
        }
        return 0;
    }

and:
<!-- language: c++ -->

    #include <thread>
    #include <chrono>

    int main()
    {
        using namespace std::literals;
        for (int i = 0; i < 10000000; i++)
        {
            std::this_thread::sleep_for(0s);
        }
        return 0;
    }

I get the following timings on my system (Ubuntu 22.04 LTS, kernel version 5.19.0-43-generic). For the `yield()` version:

    ./a.out 0,33s user 1,36s system 99% cpu 1,687 total

and for the `sleep_for(0s)` version:

    ./a.out 0,14s user 0,00s system 99% cpu 0,148 total

Why is `std::this_thread::yield()` 10x slower than `std::this_thread::sleep_for(0s)`?
N.B. Timing is similar between g++ and clang++.
Edit: As pointed out in the answer, this is an optimization in the standard library implementation; calling `sleep(0)` is in fact 300x slower (50us vs 150ns).
Answer 1
Score: 8
Taking a quick look at the source for `this_thread::sleep_for`:

    template<typename _Rep, typename _Period>
      inline void
      sleep_for(const chrono::duration<_Rep, _Period>& __rtime)
      {
        if (__rtime <= __rtime.zero())
          return;
        ...

So `sleep_for(0s)` does nothing. Your test program indeed uses 0.0s of system time: it is basically an empty loop that runs entirely in user space (I suspect that if you compile with optimizations, it will be removed completely).
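The guard in the quoted source can be checked in isolation. The following is a minimal sketch, not the library's code: `passes_zero_guard` is a hypothetical helper that applies the same comparison and reports whether a given duration would get past the early return.

```cpp
#include <chrono>

// Hypothetical helper (not part of any standard library): applies the same
// comparison as the guard quoted above and returns true only when the
// duration is strictly positive, i.e. when sleep_for would NOT return early.
template <typename Rep, typename Period>
bool passes_zero_guard(const std::chrono::duration<Rep, Period>& rtime)
{
    // duration::zero() is a static member; calling it through the object
    // mirrors the style of the quoted source.
    return !(rtime <= rtime.zero());
}
```

With this helper, `passes_zero_guard(std::chrono::seconds(0))` is false, so a zero (or negative) duration takes the early return, while any positive duration such as `std::chrono::nanoseconds(1)` gets past it.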
On the other hand, `yield` calls<sup>*</sup> `sched_yield`, which in turn calls `schedule()` in kernel space, so it executes at least some logic to check whether there is another thread to schedule.
I believe that your 0.33s of user space time is basically syscall overhead.
<sup>*<sub> Actually jumps to `__libcpp_thread_yield`, which then calls `sched_yield`, at least on Linux</sub></sup>
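To compare the cases per call rather than per process, a rough timing sketch might look like the one below. It assumes a POSIX system. `ns_per_call` and the three wrapper functions are hypothetical names introduced here, and POSIX `nanosleep` with a zero `timespec` is used as a stand-in for a zero-length sleep request that is actually issued, which is an assumption about what `sleep(0)` in the question's edit refers to.

```cpp
#include <chrono>
#include <thread>
#include <time.h>   // nanosleep, timespec (POSIX)

// Hypothetical helper: average wall-clock cost of one call to f, in
// nanoseconds, measured over `iterations` calls with a monotonic clock.
template <typename F>
double ns_per_call(F f, int iterations)
{
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i)
        f();
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / iterations;
}

// Asks the scheduler whether another thread should run.
inline void yield_once() { std::this_thread::yield(); }

// Zero duration: expected to hit the early return shown in the quoted source.
inline void sleep_for_zero() { std::this_thread::sleep_for(std::chrono::seconds(0)); }

// Assumption: a zero-length sleep requested directly through POSIX nanosleep.
inline void nanosleep_zero()
{
    timespec ts{};            // tv_sec = 0, tv_nsec = 0
    nanosleep(&ts, nullptr);
}
```

Calling, for example, `ns_per_call(yield_once, 100000)` and `ns_per_call(sleep_for_zero, 100000)` gives per-call figures that can be set against the numbers quoted in the question. The results depend on the machine, the kernel, and the optimization level, so treat them as indicative only.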