When (and how) do executors yield control back to the event loop?
Question
I am trying to wrap my head around asyncio for the first time, and I think I've got a basic grasp on coroutines and how to await their corresponding objects. Now, I've encountered AbstractEventLoop.run_in_executor, and the concept makes sense in the abstract (no pun intended): you have some blocking operation, so you kick it off to a thread so that the main thread (i.e. the thread running the main event loop) can continue its work.
The thing that I do not understand is how the event loop manages context switches between coroutines (cooperative multitasking) and this newly created thread (preemptive multitasking). As I understand it, every coroutine is awaited, and the act of awaiting that coroutine yields control back to the event loop. This allows the event loop to start running another coroutine. But threads are not cooperative in this manner -- there is some other scheduler (I believe in your OS, but that may be wrong) that schedules when threads run, and threads can be stopped at any point in their execution. Additionally, why do you even call run_in_executor on the event loop? Shouldn't the newly created thread be completely separate from the thread that is running the event loop in order to allow for the concurrency that we're looking for?
My only guess at what could be happening is that since run_in_executor returns a coroutine (which is also a little confusing -- how do you get a coroutine out of a thread?), awaiting this coroutine causes a context switch in the underlying thread, but I really have no idea how you could implement something like that.
Answer 1
Score: 3
You're on the right track with your understanding of coroutines and asyncio. Let's break down the interaction between the event loop, coroutines, and threads.
run_in_executor is a method provided by the event loop that runs a function in a separate thread (or process) and returns an awaitable object (specifically, an asyncio.Future) representing the eventual result of the function. This is useful when you have a blocking operation that you want to offload from the main thread running the event loop. You're correct that threads are scheduled by the OS and can be stopped at any point in their execution, which is what makes them preemptively scheduled.
When you call run_in_executor on the event loop, the loop submits the function you've provided to an executor. By default that is a thread pool, which starts a new worker thread or reuses an idle one. This worker thread is indeed separate from the main thread running the event loop. The reason you call run_in_executor on the event loop is that the loop has to coordinate the result of the blocking operation with the coroutines it's already handling.
Now, let's talk about the awaitable object that run_in_executor returns. When you await this object, you are not awaiting the actual thread, but rather an asyncio.Future object that represents the result of the function you've run in the separate thread. The Future object is a placeholder for the eventual result of the computation. Awaiting the Future suspends only the coroutine that awaits it, so the event loop can continue to execute other coroutines until the result is ready.
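To make the point above concrete, here is a minimal sketch, assuming the default executor (passing None) and a hypothetical blocking_io function standing in for any blocking call. It checks that the returned object is an asyncio.Future rather than a coroutine, and that another coroutine keeps running on the loop while the worker thread is blocked.

```python
import asyncio
import time


def blocking_io():
    # Hypothetical stand-in for any blocking call (file I/O, a legacy client, ...).
    time.sleep(0.2)
    return "done"


async def ticker(log):
    # Runs on the event loop while the worker thread is sleeping.
    for i in range(3):
        log.append(f"tick {i}")
        await asyncio.sleep(0.05)


async def main():
    loop = asyncio.get_running_loop()
    log = []
    fut = loop.run_in_executor(None, blocking_io)  # None selects the default executor
    is_future = isinstance(fut, asyncio.Future)
    is_coroutine = asyncio.iscoroutine(fut)
    await ticker(log)   # the loop is free to run this in the meantime
    result = await fut  # suspends only this coroutine until the result arrives
    return is_future, is_coroutine, log, result


is_future, is_coroutine, log, result = asyncio.run(main())
```

Note that the blocking function starts running as soon as run_in_executor is called, not when the Future is awaited.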
When the function you've run in the separate thread completes, its result (or any exception raised during execution) is handed back to the event loop in a thread-safe way: a completion callback asks the loop, via call_soon_threadsafe, to fill in the Future. The event loop does not poll the Future. It is woken up by that request, sets the result on its own thread, and schedules the coroutine that was awaiting the Future to resume execution. This way, the event loop coordinates coroutines and threads without ever context-switching the threads itself.
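The hand-off described above can be approximated by hand. The following is only a sketch, not asyncio's actual implementation: run_blocking and its inner functions are hypothetical names, and the thread identifiers are returned purely to show where each step runs.

```python
import asyncio
import threading
from concurrent.futures import ThreadPoolExecutor


async def run_blocking(executor, func, *args):
    """Hypothetical helper approximating what run_in_executor arranges."""
    loop = asyncio.get_running_loop()
    aio_future = loop.create_future()
    seen = {}

    def set_on_loop(cf_future):
        # Runs on the event loop thread, so touching aio_future is safe here.
        seen["setter_thread"] = threading.get_ident()
        if aio_future.cancelled():
            return
        exc = cf_future.exception()
        if exc is not None:
            aio_future.set_exception(exc)
        else:
            aio_future.set_result(cf_future.result())

    def on_done(cf_future):
        # Usually runs on the worker thread: only schedule work on the loop.
        loop.call_soon_threadsafe(set_on_loop, cf_future)

    executor.submit(func, *args).add_done_callback(on_done)
    return await aio_future, seen["setter_thread"]


async def main():
    with ThreadPoolExecutor(max_workers=1) as pool:
        # threading.get_ident runs in the worker and reports that thread's id.
        worker_thread, setter_thread = await run_blocking(pool, threading.get_ident)
    return worker_thread, setter_thread, threading.get_ident()


worker_thread, setter_thread, loop_thread = asyncio.run(main())
```

The function body runs on the worker thread, while the Future is completed on the loop's own thread, which is why no locking around the Future is needed.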
In summary, run_in_executor is called on the event loop because the loop is what ties the thread's result back to the coroutines waiting for it. The object returned by run_in_executor is not a coroutine but an asyncio.Future that represents the eventual result of the blocking operation. The event loop handles the coordination of results between threads and coroutines, allowing for concurrent execution.
Answer 2
Score: 1
Thinking that asyncio switches between coroutines and threads has it the wrong way around: it is threading that switches between asyncio and its executor.
The context switching between the two happens via the usual thread switching mechanism: the Global Interpreter Lock (GIL) is regularly handed from one thread to another.
Say asyncio is currently executing a coroutine; then the GIL mechanism can pause the coroutine at any point and run an executor thread instead. Likewise, if an executor thread is running, the GIL mechanism may pause that thread and go back to running asyncio and its last coroutine instead.
This by itself is exactly the same behavior as for unrelated threads, or threads of different event loops, or threads of different executors.
The special "magic sauce" that asyncio adds is that it registers a callback for any executor thread task: asyncio tells the executor to run some extra code that basically says "I am done, you can proceed". This in turn triggers asyncio to continue running each coroutine that was waiting on the executor task.
Notably, almost none of this is "async code". Submitting a task, having it run concurrently, and triggering an action when done are all exactly equivalent to regular threading/concurrent.futures usage. asyncio merely adds a thin facade in front of it so that it looks async for code that must await the thread to complete.
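For comparison, the same three steps (submit, run concurrently, trigger an action when done) can be written with no asyncio at all. This is a small sketch using only concurrent.futures and threading; blocking_square and on_done are hypothetical names chosen for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

done = threading.Event()
results = []


def blocking_square(x):
    # Hypothetical stand-in for a blocking computation.
    return x * x


def on_done(future):
    # The "I am done, you can proceed" action, triggered on completion.
    results.append(future.result())
    done.set()


with ThreadPoolExecutor(max_workers=1) as pool:
    pool.submit(blocking_square, 7).add_done_callback(on_done)
    # The submitting thread waits on a plain threading.Event instead of awaiting.
    finished = done.wait(timeout=5)
```

The only asyncio-specific part of run_in_executor is what replaces done.wait(): instead of blocking a thread, the waiting side is a coroutine that the loop resumes.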
A simple recreation of the mechanism would use an asyncio.Event to signal completion and instruct the thread to call loop.call_soon_threadsafe(event.set) when done. This makes the thread trigger the event, and thus wake up every coroutine waiting for the event.
    import asyncio
    import threading

    async def as_thread(_sync_call, *args, **kwargs):
        """Simple 'thread coroutine' to run blocking code"""
        # prepare an event that another thread can trigger in this event loop
        done = asyncio.Event()
        loop = asyncio.get_running_loop()

        def wrapper():
            # the return value is discarded here; only completion is signalled
            _sync_call(*args, **kwargs)
            # this makes the thread inform the loop that it is done
            loop.call_soon_threadsafe(done.set)

        # 100% regular Thread! The GIL and all other nasty thread stuff lurks here!
        threading.Thread(target=wrapper).start()
        # simply await for the thread to trigger the event when done
        await done.wait()
The added machinery of run_in_executor is mainly about bookkeeping (cancelling the task, ...) and providing a return value or exception.
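As a rough illustration of that extra machinery, the as_thread idea can be extended to pass back a return value or exception by swapping the Event for a Future created on the loop. This is a sketch under the same assumptions as above, not how run_in_executor is actually implemented; as_thread_with_result and deliver are hypothetical names.

```python
import asyncio
import threading


async def as_thread_with_result(sync_call, *args, **kwargs):
    """Variant of as_thread that also hands back the result or exception."""
    loop = asyncio.get_running_loop()
    future = loop.create_future()

    def deliver(setter, value):
        # Runs on the loop thread; skip if the waiting side was cancelled meanwhile.
        if not future.done():
            setter(value)

    def wrapper():
        try:
            result = sync_call(*args, **kwargs)
        except Exception as exc:
            loop.call_soon_threadsafe(deliver, future.set_exception, exc)
        else:
            loop.call_soon_threadsafe(deliver, future.set_result, result)

    threading.Thread(target=wrapper).start()
    return await future


async def main():
    error = None
    value = await as_thread_with_result(divmod, 7, 2)
    try:
        await as_thread_with_result(divmod, 1, 0)
    except ZeroDivisionError as exc:
        # The exception raised in the thread resurfaces at the await.
        error = type(exc).__name__
    return value, error


value, error = asyncio.run(main())
```

Cancelling the awaiting coroutine here only stops the wait; the thread itself keeps running to completion, which is part of the bookkeeping a real executor has to account for.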