Goroutines 8kb and windows OS thread 1 mb
Question
As a Windows user, I know that OS threads consume ~1 MB of memory each, because "by default, Windows allocates 1 MB of memory for each thread's user-mode stack."
How does golang use only ~8 KB of memory for each goroutine, if an OS thread is so much more gluttonous? Are goroutines a sort of virtual thread?
Answer 1
Score: 5
Goroutines are not threads, they are (from the spec):
> ...an independent concurrent thread of control, or goroutine, within the same address space.
Effective Go defines them as:
> They're called goroutines because the existing terms—threads, coroutines, processes, and so on—convey inaccurate connotations. A goroutine has a simple model: it is a function executing concurrently with other goroutines in the same address space. It is lightweight, costing little more than the allocation of stack space. And the stacks start small, so they are cheap, and grow by allocating (and freeing) heap storage as required.
Goroutines don't have their own threads. Instead, multiple goroutines are (or may be) multiplexed onto the same OS threads, so if one should block (e.g. waiting for I/O or a blocking channel operation), the others continue to run.
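As a minimal sketch of that multiplexing (the worker count and timings here are arbitrary), one goroutine can block forever on a channel receive while the others keep making progress, because the runtime only parks the blocked goroutine rather than an OS thread:

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	block := make(chan struct{}) // never written to, so the receive below blocks forever

	// This goroutine blocks on a channel receive; the runtime simply parks it.
	go func() {
		<-block
	}()

	// These goroutines keep running: the parked one does not tie up an OS thread.
	for i := 0; i < 3; i++ {
		go func(id int) {
			for {
				fmt.Println("worker", id, "still running")
				time.Sleep(500 * time.Millisecond)
			}
		}(i)
	}

	time.Sleep(2 * time.Second) // let the workers print a few lines, then exit
}
```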
The actual number of threads executing goroutines simultaneously can be set with the runtime.GOMAXPROCS() function. Quoting from the runtime package documentation:
> The GOMAXPROCS variable limits the number of operating system threads that can execute user-level Go code simultaneously. There is no limit to the number of threads that can be blocked in system calls on behalf of Go code; those do not count against the GOMAXPROCS limit.
Note that in the current implementation, by default only 1 thread is used to execute goroutines.
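For example, querying and changing that limit looks like this (a small sketch; the value 2 is arbitrary):

```go
package main

import (
	"fmt"
	"runtime"
)

func main() {
	// An argument below 1 only queries the current setting without changing it.
	fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0))

	// Allow at most 2 OS threads to execute user-level Go code at once;
	// the previous setting is returned.
	prev := runtime.GOMAXPROCS(2)
	fmt.Println("previous setting:", prev)
}
```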
Answer 2
Score: 2
1 MiB is the default, as you correctly noted. You can pick your own stack size easily (however, the minimum is still a lot higher than ~8 kiB).
That said, goroutines aren't threads. They're just tasks with coöperative multi-tasking, similar to Python's. The goroutine itself is just the code and data required to do what you want; there's also a separate scheduler (which runs on one or more OS threads), which actually executes that code.
In pseudo-code:

    loop forever
        take job from queue
        execute job
    end loop
Of course, the "execute job" part can be very simple, or very complicated. The simplest thing you can do is just execute a given delegate (if your language supports something like that). In effect, this is simply a method call. In more complicated scenarios, there can also be stuff like restoring some kind of context, handling continuations and coöperative task yields, for example.
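As a toy sketch of that loop in Go (just the shape of the idea, not how the real runtime scheduler works), the "queue" can be a channel of functions and the "scheduler" a goroutine draining it:

```go
package main

import "fmt"

func main() {
	jobs := make(chan func(), 16) // the "queue"

	done := make(chan struct{})
	go func() { // the "scheduler" loop: take a job from the queue, execute it, repeat
		for job := range jobs {
			job()
		}
		close(done)
	}()

	for i := 0; i < 3; i++ {
		i := i // capture the loop variable for the closure
		jobs <- func() { fmt.Println("executing job", i) }
	}
	close(jobs)
	<-done // wait for the loop to drain the queue
}
```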
This is a very light-weight approach, and very useful when doing asynchronous programming (which is almost everything nowadays :)). Many languages now support something similar - Python is the first one I've seen with this ("tasklets"), long before Go. Of course, in an environment without pre-emptive multi-threading, this was pretty much the default.
In C#, for example, there are Tasks. They're not entirely the same as goroutines, but in practice, they come pretty close - the main difference being that Tasks use threads from the thread pool (usually), rather than a separate dedicated "scheduler" thread. This means that if you start 1000 tasks, it is possible for them to be run by 1000 separate threads; in practice, that would require you to write very bad Task code (e.g. using only blocking I/O, sleeping threads, waiting on wait handles etc.). If you use Tasks for asynchronous non-blocking I/O and CPU work, they come pretty close to goroutines - in actual practice. The theory is a bit different :)
EDIT:
To clear up some confusion, here is what a typical C# asynchronous method might look like:
    async Task<string> GetData()
    {
        var html = await HttpClient.GetAsync("http://www.google.com");
        var parsedStructure = Parse(html);
        var dbData = await DataLayer.GetSomeStuffAsync(parsedStructure.ElementId);
        return dbData.First().Description;
    }
From the point of view of the GetData method, the entire processing is synchronous - it's just as if you didn't use the asynchronous methods at all. The crucial difference is that you're not using up threads while you're doing the "waiting"; but ignoring that, it's almost exactly the same as writing synchronous blocking code. This also applies to any issues with shared state, of course - there isn't much of a difference between the multi-threading issues with await and those with blocking multi-threaded I/O. It's easier to avoid with Tasks, but just because of the tools you have, not because of any "magic" that Tasks do.
The main difference from goroutines in this aspect is that Go doesn't really have blocking methods in the usual sense of the word. Instead of blocking, they queue their particular asynchronous request and yield. When the OS (and any other layers in Go - I don't have deep knowledge about the inner workings) receives the response, it posts it to the goroutine scheduler, which in turn knows that the goroutine that "waits" for the response is now ready to resume execution; when it actually gets a slot, it will continue on from the "blocking" call as if it had really been blocking - but in effect, it's very similar to what C#'s await does. There's no fundamental difference - there are quite a few differences between C#'s approach and Go's, but they're not all that huge.
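For comparison, a rough Go counterpart to the GetData example might look like the sketch below (the URL is carried over from the C# snippet, and the "parsing" step is omitted). http.Get reads like a blocking call, but the runtime only parks the calling goroutine while the request is in flight, leaving the OS thread free to run other goroutines:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
)

// getData looks entirely synchronous, just as GetData does to its caller in C#.
// Each "blocking" call below only parks this goroutine; the OS thread carrying it
// is handed other goroutines to run in the meantime.
func getData() (string, error) {
	resp, err := http.Get("http://www.google.com")
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("fetched %d bytes", len(body)), nil
}

func main() {
	s, err := getData()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(s)
}
```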
And also note that this is fundamentally the same approach used on old Windows systems without pre-emptive multi-tasking - any "blocking" method would simply yield the thread's execution back to the scheduler. Of course, on those systems, you only had a single CPU core, so you couldn't execute multiple threads at once, but the principle is still the same.
Answer 3
Score: 0
Goroutines are what we call green threads. They are not OS threads; the Go scheduler is responsible for them. This is why they can have much smaller memory footprints.
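A quick way to see that footprint for yourself is to compare the runtime's stack usage before and after spawning a pile of idle goroutines (a sketch; the per-goroutine number varies by Go version, since the initial stack size has changed across releases):

```go
package main

import (
	"fmt"
	"runtime"
)

func stackInUse() uint64 {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.StackInuse // bytes of stack memory currently in use
}

func main() {
	before := stackInUse()

	park := make(chan struct{})
	for i := 0; i < 100000; i++ {
		go func() { <-park }() // 100,000 goroutines, all parked on a channel
	}

	after := stackInUse()
	fmt.Printf("goroutines: %d\n", runtime.NumGoroutine())
	fmt.Printf("stack grew by ~%d KiB (~%d bytes per goroutine)\n",
		(after-before)/1024, (after-before)/100000)
	close(park)
}
```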