co_spawn cost and alternatives


Question

If I have a regular method that calls a coroutine on a critical execution path, using co_spawn can potentially introduce latency.

When I use co_spawn, it schedules the coroutine to run concurrently with the rest of the code, which means it doesn't block the execution of the calling method. However, there is still some overhead involved in scheduling and managing the coroutine, which can impact the overall latency of the application.

Is there a more efficient way to call a coroutine from a regular function?
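
For context, a minimal sketch of the pattern being described (Boost.Asio; do_work and on_critical_path are just illustrative names):

#include <boost/asio.hpp>
#include <chrono>
namespace asio = boost::asio;

// Hypothetical coroutine standing in for the asynchronous work.
asio::awaitable<void> do_work() {
    asio::steady_timer timer(co_await asio::this_coro::executor,
                             std::chrono::milliseconds(1));
    co_await timer.async_wait(asio::use_awaitable);
}

// The "regular method" on the critical path: it hands the coroutine off to the
// executor via co_spawn and returns without waiting for it to finish.
void on_critical_path(asio::io_context& ioc) {
    co_spawn(ioc, do_work(), asio::detached);
}

int main() {
    asio::io_context ioc;
    on_critical_path(ioc);
    ioc.run(); // runs the spawned coroutine to completion
}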


Answer 1

Score: 1

> When I use co_spawn, it schedules the coroutine to run concurrently with the rest of the code, which means it doesn't block the execution of the calling method. However, there is still some overhead involved in scheduling and managing the coroutine

This is not necessarily true. You're conflating concurrency with asynchrony.

In the context of ASIO, asynchronous operations need not be scheduled. Instead, they may delegate work to the kernel or, indeed, to hardware that is naturally asynchronous. The only element of "scheduling" there is the invocation of the completion. In fact, if your IO operations took infinitely little time, the callback invocation would dominate the observed wall-clock timing.

However, IO operations are usually decidedly costly relative to, say, CPU-oriented load. You could start and finish many tasks in the span of a single TCP round trip (even on the loopback network). This is why asynchronous IO frameworks are popular, and why Windows implements IO completion ports, and so on. The many innovations over time exist because they make sense.

> Is there a more efficient way to call a coroutine from a regular function?

Yes. In principle, writing your own state machines directly on top of the available OS primitives is theoretically fastest. However, that would be tedious, platform-dependent, and VeryHard(TM) to integrate with other code. This is exactly the hole that Asio neatly plugs.

To reduce the overhead to a minimum:

  • opt out of threading (https://stackoverflow.com/a/72292313/85371)
  • opt out of type-erased executors (e.g. parameterize the awaitable on io_context::executor_type, as in the sketches below, instead of the default any_io_executor)
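
Roughly, and assuming a single-threaded application, both opt-outs look like this (the task alias is only illustrative):

// Optionally, #define BOOST_ASIO_DISABLE_THREADS before including Asio to
// compile out thread support entirely.
#include <boost/asio.hpp>
namespace asio = boost::asio;

// Bind awaitables to the concrete io_context executor instead of the
// type-erased any_io_executor default.
using Executor = asio::io_context::executor_type;
template <typename T> using task = asio::awaitable<T, Executor>;

int main() {
    // Concurrency hint 1: tells the io_context to optimize for
    // single-threaded use.
    asio::io_context ioc(1);
    ioc.run();
}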

The library performs clever optimizations relating to (but not limited to) dispatching completions queued on the local thread, and it manages allocation order to maximize reuse and minimize fragmentation. Keep an eye on the immediate-completion optimization feature that landed in the latest releases.

E.g. when you do:

Live On Compiler Explorer

#include <boost/asio.hpp>
#include <cstdlib>     // ::exit
#include <string_view>
namespace asio = boost::asio;

// Bind the awaitable to the concrete io_context executor type instead of the
// type-erased any_io_executor default.
using Ex = asio::io_context::executor_type;

asio::awaitable<int, Ex> static inline answer(std::string_view prompt) {
    co_return prompt.length() + 9;
}

int main() {
    asio::io_context ioc(1); // concurrency hint: optimized for single-threaded use
    co_spawn(ioc.get_executor(), answer("Life, the Universe and Everything"),
             [](std::exception_ptr, int i) { ::exit(i); });
    ioc.run();
}

You see the program returning 42 without any scheduling overhead. The most tangible overhead I see is allocating the coroutine frame, which you would usually need anyway if you need a coroutine in the first place (otherwise, you could just be submitting handlers to a queue, or to the IO context).
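
To make that parenthetical concrete, here is a minimal sketch of submitting a plain handler to the io_context with asio::post, where no coroutine frame is allocated at all (an illustration, not part of the original answer):

#include <boost/asio.hpp>
#include <cstdlib> // ::exit
namespace asio = boost::asio;

int main() {
    asio::io_context ioc(1);
    // A plain completion handler: the lambda is queued on the io_context and
    // invoked by run(); no coroutine frame is involved.
    asio::post(ioc, [] { ::exit(42); });
    ioc.run();
}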

> Is there a more efficient way to call a coroutine from a regular function?

Coroutines can be very lightweight. Just how lightweight they become depends on your compiler's ability to optimize them, which in turn depends mostly on the complexity of the awaitable's types (promise/handle). In principle it is very possible to reduce the cost, at the cost of cutting functionality. Asio's awaitable is designed for asynchronous IO scenarios, for obvious reasons. If you don't want or need that, look instead at lower-level or general-purpose libraries, such as cppcoro. Of course, you will then be on your own integrating them with your application's asynchronous IO needs, if you have any.
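
As a rough illustration of how far that can go, here is a sketch of a toy eager coroutine type built directly on <coroutine>; tiny_task is a made-up name and not related to Asio or cppcoro:

#include <coroutine>
#include <cstdio>
#include <exception> // std::terminate

// A deliberately minimal coroutine type: it runs eagerly, stores one int
// result in the promise, and involves no scheduling at all.
struct tiny_task {
    struct promise_type {
        int result = 0;
        tiny_task get_return_object() {
            return {std::coroutine_handle<promise_type>::from_promise(*this)};
        }
        std::suspend_never  initial_suspend() noexcept { return {}; } // start immediately
        std::suspend_always final_suspend() noexcept { return {}; }   // keep the frame (and result) alive
        void return_value(int v) noexcept { result = v; }
        void unhandled_exception() { std::terminate(); }
    };

    std::coroutine_handle<promise_type> handle;
    ~tiny_task() { if (handle) handle.destroy(); }

    int get() const { return handle.promise().result; }
};

tiny_task answer() { co_return 42; }

int main() {
    std::printf("%d\n", answer().get()); // prints 42; the only cost is the coroutine frame
}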

