When (and how) do executors yield control back to the event loop?

Question

I am trying to wrap my head around asyncio for the first time, and I think I've got a basic grasp on coroutines & how to await their corresponding objects. Now, I've encountered AbstractEventLoop.run_in_executor, and the concept makes sense in the abstract (no pun intended): you have some blocking operation, so you kick it off to a thread so that the main thread (i.e. the thread running the main event loop) can continue its work.

The thing that I do not understand is how the event loop manages context switches between coroutines (cooperative multitasking), and this newly created thread (preemptive multitasking). As I understand it, every coroutine is awaited, and the act of awaiting that coroutine yields control back to the event loop. This allows the event loop to start running another coroutine. But threads are not cooperative in this manner -- there is some other scheduler (I believe in your OS, but that may be wrong) that schedules when threads run, and threads can be stopped at any point in their execution. Additionally, why do you even call run_in_executor on the event loop? Shouldn't the newly created thread be completely separate from the thread that is running the event loop in order to allow for the concurrency that we're looking for?

My only guess at what could be happening is that since run_in_executor returns a coroutine (which is also a little confusing -- how do you get a coroutine out of a thread?), awaiting this coroutine causes a context switch in the underlying thread, but I really have no idea how you could implement something like that.

Answer 1

Score: 3

You're on the right track with your understanding of coroutines and asyncio. Let's break down the interaction between the event loop, coroutines, and threads.

run_in_executor is an event loop method that runs a function in a separate thread (or process, depending on the executor) and returns an awaitable, specifically an asyncio.Future, that represents the function's eventual result. It is useful when you have a blocking operation that you want to keep off the thread running the event loop. You're correct that threads are scheduled by the OS and can be paused at any point in their execution, which is what makes them preemptive.
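As a minimal sketch of that call (blocking_io and its delay are made-up stand-ins for a real blocking operation, not anything from the question), the following checks that the returned object is a Future rather than a coroutine, and then awaits it:

```python
import asyncio
import time


def blocking_io(delay: float) -> str:
    # Hypothetical stand-in for any blocking call (file I/O, a sync HTTP client, ...).
    time.sleep(delay)
    return "done"


async def main() -> str:
    loop = asyncio.get_running_loop()
    # Passing None selects the loop's default executor (a thread pool).
    fut = loop.run_in_executor(None, blocking_io, 0.05)
    # With the standard asyncio event loop, this is a Future, not a coroutine.
    assert isinstance(fut, asyncio.Future)
    assert not asyncio.iscoroutine(fut)
    return await fut


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The isinstance check assumes the standard asyncio event loop; a third-party loop could return a different awaitable type.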

When you call run_in_executor, the event loop submits your function to an executor. If you pass None as the executor, it uses the loop's default one, a concurrent.futures.ThreadPoolExecutor, which either starts a new worker thread or reuses an idle one. That worker thread is indeed separate from the thread running the event loop. The reason you call run_in_executor on the event loop is that the loop is what ties the thread's result back to the coroutines it is already running: the worker needs to know which loop to notify when the work finishes.
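One way to see that separation is to compare thread identifiers. This is only an illustrative sketch; which_thread is a helper invented for the example:

```python
import asyncio
import threading


def which_thread() -> int:
    # Executed by an executor worker, so this reports the worker's thread id.
    return threading.get_ident()


async def main() -> tuple[int, int]:
    loop = asyncio.get_running_loop()
    loop_thread = threading.get_ident()  # thread that runs the event loop
    worker_thread = await loop.run_in_executor(None, which_thread)
    return loop_thread, worker_thread


if __name__ == "__main__":
    loop_id, worker_id = asyncio.run(main())
    print("event loop thread:", loop_id)
    print("executor thread:  ", worker_id)
```

The two identifiers differ because the event loop thread is still alive while the worker runs.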

Now for the object that run_in_executor returns. It is not a coroutine, and awaiting it does not await the thread itself. It is an asyncio.Future: a placeholder for the eventual result of the function running in the other thread. Awaiting the Future suspends the awaiting coroutine, which hands control back to the event loop so that it can run other coroutines until the result is ready.
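A small sketch can make this visible, assuming a hypothetical ticker coroutine that records how often it gets to run while a blocking time.sleep is in flight on an executor thread:

```python
import asyncio
import time


async def ticker(ticks: list[int]) -> None:
    # A cooperative coroutine: every sleep hands control back to the loop.
    i = 0
    while True:
        ticks.append(i)
        i += 1
        await asyncio.sleep(0.01)


async def main() -> int:
    loop = asyncio.get_running_loop()
    ticks: list[int] = []
    task = asyncio.create_task(ticker(ticks))
    # main() is suspended on this await, but the loop itself is not blocked,
    # so ticker() keeps running in the meantime.
    await loop.run_in_executor(None, time.sleep, 0.2)
    task.cancel()
    return len(ticks)


if __name__ == "__main__":
    print("ticks while the thread was blocked:", asyncio.run(main()))
```

If time.sleep(0.2) were called directly inside main() instead, the ticker would not run at all during those 0.2 seconds.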

When the function finishes, the worker thread does not write to the asyncio.Future directly, because asyncio futures are not thread-safe. Instead, in CPython's implementation, a completion callback uses loop.call_soon_threadsafe to ask the event loop to copy the result (or the exception) into the Future. The loop does not poll for this: on a later iteration it runs the scheduled callback, marks the Future as done, and schedules the coroutine that was awaiting it to resume. This way the event loop coordinates coroutines and threads without ever context-switching the threads itself.
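The hand-off can be approximated by hand. The sketch below is not asyncio's actual code; run_in_pool, copy_result, and on_done are names invented for illustration, and cancellation is mostly ignored:

```python
import asyncio
import concurrent.futures


async def run_in_pool(pool, func, *args):
    loop = asyncio.get_running_loop()
    aio_future = loop.create_future()

    def copy_result(cf_future: concurrent.futures.Future) -> None:
        # Runs in the event loop's thread, so touching aio_future is safe.
        if aio_future.cancelled():
            return
        exc = cf_future.exception()
        if exc is not None:
            aio_future.set_exception(exc)
        else:
            aio_future.set_result(cf_future.result())

    def on_done(cf_future: concurrent.futures.Future) -> None:
        # Usually runs in the worker thread: only schedule work on the loop.
        loop.call_soon_threadsafe(copy_result, cf_future)

    pool.submit(func, *args).add_done_callback(on_done)
    return await aio_future


if __name__ == "__main__":
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        print(asyncio.run(run_in_pool(pool, pow, 2, 10)))
```

The worker never calls set_result itself; it only asks the loop, via call_soon_threadsafe, to do so in the loop's own thread.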

In summary, run_in_executor is called on the event loop because the loop is what connects the thread's result to the waiting coroutine. The object it returns is not a coroutine but an asyncio.Future representing the eventual result of the blocking operation. The thread runs under the OS scheduler as usual; the event loop only handles delivering its result to the coroutines that are waiting for it.

Answer 2

Score: 1

Thinking that asyncio switches between coroutines and threads has it the wrong way around: It is threading that switches between asyncio and its executor.

The context switching between the two happens via the usual thread switching mechanism: the Global Interpreter Lock (GIL) is regularly handed from one thread to another.
Say asyncio is currently executing a coroutine: the GIL mechanism can pause the coroutine at any point and run an executor thread instead. Likewise, if an executor thread is running, the GIL mechanism may pause that thread and go back to running asyncio and its last coroutine.
This by itself is exactly the same behavior as for unrelated threads, threads of different event loops, or threads of different executors.

The special "magic sauce" that asyncio adds is that it registers a callback for any executor thread task: asyncio tells the executor to run some extra code that basically says "I am done, you can proceed". This in turn triggers asyncio to continue running each coroutine that was waiting on the executor task.
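The standard library exposes this bridging step as asyncio.wrap_future, which turns a concurrent.futures.Future into one that can be awaited in the running loop. A minimal sketch, with square as a made-up workload:

```python
import asyncio
import concurrent.futures


def square(x: int) -> int:
    return x * x


async def main() -> int:
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        # A plain concurrent.futures.Future: nothing asyncio-specific yet.
        cf_future = pool.submit(square, 7)
        # wrap_future registers the "I am done" callback and returns an
        # asyncio.Future tied to the running event loop.
        aio_future = asyncio.wrap_future(cf_future)
        return await aio_future


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Submitting to a pool and wrapping the result like this is roughly what run_in_executor does on your behalf.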

Notably, almost none of this is "async code". Submitting a task, having it run concurrently, and triggering an action when it is done are all exactly equivalent to regular threading/concurrent.futures usage. asyncio merely adds a thin facade in front of it so that it looks async to code that must await the thread's completion.
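For comparison, here is a sketch of the same submit, run, and notify pattern with no asyncio at all. The names slow_double and run_with_callback are invented for the example:

```python
import concurrent.futures
import threading


def slow_double(x: int) -> int:
    return 2 * x


def run_with_callback(x: int) -> int:
    finished = threading.Event()
    results: list[int] = []

    def on_done(fut: concurrent.futures.Future) -> None:
        # Called once the task has finished: record the result and signal.
        results.append(fut.result())
        finished.set()

    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        pool.submit(slow_double, x).add_done_callback(on_done)
        # Blocks the calling thread; the asyncio version awaits here instead.
        finished.wait()
    return results[0]


if __name__ == "__main__":
    print(run_with_callback(21))
```

The only structural difference from the asyncio case is the waiting step: a blocking finished.wait() here, an await that frees the event loop there.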

A simple recreation of the mechanism would use an asyncio.Event to signal completion and instruct the thread to call loop.call_soon_threadsafe(event.set) when done. This makes the thread trigger the event and thus wake up every coroutine waiting for it.

import asyncio
import threading


async def as_thread(_sync_call, *args, **kwargs):
    """Simple 'thread coroutine' to run blocking code"""
    # prepare an event that another thread can trigger in this event loop
    done = asyncio.Event()
    loop = asyncio.get_running_loop()

    def wrapper():
        try:
            _sync_call(*args, **kwargs)
        finally:
            # this makes the thread inform the loop that it is done,
            # even if the call raised (otherwise the await below would hang)
            loop.call_soon_threadsafe(done.set)

    # 100% regular Thread! The GIL and all other nasty thread stuff lurks here!
    threading.Thread(target=wrapper).start()
    # simply await for the thread to trigger the event when done
    await done.wait()

The added machinery of run_in_executor is mainly about bookkeeping (cancelling the task, ...) and providing a return value or exception.
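To illustrate the "return value or exception" part, the as_thread idea can be extended as in the sketch below. This is an assumption about one possible approach, not how run_in_executor is implemented; as_thread_with_result and the outcome dict are hypothetical, and cancellation is not handled:

```python
import asyncio
import threading


async def as_thread_with_result(sync_call, *args, **kwargs):
    """Like as_thread, but hands the return value or exception back."""
    done = asyncio.Event()
    loop = asyncio.get_running_loop()
    outcome = {}  # filled in by the thread, read only after `done` is set

    def wrapper():
        try:
            outcome["value"] = sync_call(*args, **kwargs)
        except Exception as exc:
            outcome["error"] = exc
        finally:
            # Always wake the waiting coroutine, on success or failure.
            loop.call_soon_threadsafe(done.set)

    threading.Thread(target=wrapper).start()
    await done.wait()
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]


if __name__ == "__main__":
    print(asyncio.run(as_thread_with_result(len, "abc")))
```

Because the coroutine reads outcome only after the event has been set, the thread has already finished writing to it by then.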

huangapple
  • Posted on May 11, 2023 at 12:08:10
  • When reposting, please keep the link to this post: https://go.coder-hub.com/76224086.html