What kind of Garbage Collection does Go use?

Question

Go is a garbage collected language:

http://golang.org/doc/go_faq.html#garbage_collection

Here it says that it's a mark-and-sweep garbage collector, but it doesn't delve into details, and a replacement is in the works... yet, this paragraph seems not to have been updated much since Go was released.

Is it still mark-and-sweep? Is it conservative or precise? Is it generational?

Answer 1

Score: 122

Plans for Go 1.4+ garbage collector:

  • hybrid stop-the-world/concurrent collector
  • stop-the-world part limited by a 10ms deadline
  • CPU cores dedicated to running the concurrent collector
  • tri-color mark-and-sweep algorithm
  • non-generational
  • non-compacting
  • fully precise
  • incurs a small cost if the program is moving pointers around
  • lower latency, but most likely also lower throughput, than Go 1.3 GC
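
To make the "tri-color mark-and-sweep" bullet concrete, here is a minimal sketch of the idea on a toy object graph. It is not the runtime's code: the `object` type, the `mark` and `sweep` helpers, and the single-threaded worklist are all assumptions made for illustration, and the real collector runs marking concurrently with the program.

```go
package main

import "fmt"

type color int

const (
	white color = iota // not reached yet; garbage if still white after marking
	grey               // reached, but its pointers have not been scanned yet
	black              // reached and fully scanned
)

// object is a hypothetical heap object that only holds pointers to other objects.
type object struct {
	name string
	refs []*object
	c    color
}

// mark runs the tri-color marking phase starting from the given roots.
func mark(roots []*object) {
	var work []*object // the grey set
	for _, r := range roots {
		if r.c == white {
			r.c = grey
			work = append(work, r)
		}
	}
	for len(work) > 0 {
		obj := work[len(work)-1]
		work = work[:len(work)-1]
		for _, child := range obj.refs {
			if child.c == white {
				child.c = grey
				work = append(work, child)
			}
		}
		obj.c = black // all of obj's pointers have been scanned
	}
}

// sweep returns the surviving (black) objects and resets them to white
// so that the next cycle starts from a clean state.
func sweep(heap []*object) (live []*object) {
	for _, o := range heap {
		if o.c == black {
			o.c = white
			live = append(live, o)
		}
	}
	return live
}

func main() {
	a := &object{name: "a"}
	b := &object{name: "b"}
	c := &object{name: "c"} // nothing points to c
	a.refs = []*object{b}
	b.refs = []*object{a} // cycles are handled by the color check
	heap := []*object{a, b, c}

	mark([]*object{a})
	for _, o := range sweep(heap) {
		fmt.Println("live:", o.name)
	}
}
```

Only `a` and `b` survive here; `c` stays white and is dropped by the sweep. Nothing is moved, which matches the "non-compacting" bullet.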

Go 1.3 garbage collector updates on top of Go 1.1:

  • concurrent sweep (results in smaller pause times)
  • fully precise

Go 1.1 garbage collector:

  • mark-and-sweep (parallel implementation)
  • non-generational
  • non-compacting
  • mostly precise (except stack frames)
  • stop-the-world
  • bitmap-based representation
  • zero-cost when the program is not allocating memory (that is: shuffling pointers around is as fast as in C, although in practice this runs somewhat slower than C because the Go compiler is not as advanced as C compilers such as GCC)
  • supports finalizers on objects
  • there is no support for weak references

Go 1.0 garbage collector:

  • same as Go 1.1, but instead of being mostly precise the garbage collector is conservative. The conservative GC is able to ignore objects such as []byte.

Replacing the GC with a different one is controversial, for example:

  • except for very large heaps, it is unclear whether a generational GC would be faster overall
  • package "unsafe" makes it hard to implement fully precise GC and compacting GC

Answer 2

Score: 34

(For Go 1.8 - Q1 2017, see below)

The upcoming Go 1.5 concurrent garbage collector involves being able to "pace" the GC.
Here is a proposal, presented in this paper, which might make it into Go 1.5 and which also helps in understanding the GC in Go.

First, the state before 1.5 (Stop The World: STW):

> Prior to Go 1.5, Go has used a parallel stop-the-world (STW) collector.
While STW collection has many downsides, it does at least have predictable and controllable heap growth behavior.

<sup>(Photo from GopherCon 2015 presentation "Go GC: Solving the Latency Problem in Go 1.5")</sup>

The sole tuning knob for the STW collector was “GOGC”, the relative heap growth between collections. The default setting, 100%, triggered garbage collection every time the heap size doubled over the live heap size as of the previous collection:

<sup>GC timing in the STW collector.</sup>
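
The GOGC arithmetic described above can be written down in a few lines. This is only a sketch of the rule as stated in the quote; the helper name `nextGCTrigger` is made up for illustration and is not a runtime function.

```go
package main

import (
	"fmt"
	"math"
)

// nextGCTrigger returns the heap size (in bytes) at which the next
// collection would start under the STW collector's rule: the live heap
// left by the previous collection plus GOGC percent of it.
// A negative value models GOGC=off, i.e. no collection is triggered.
func nextGCTrigger(liveHeap uint64, gogc int) uint64 {
	if gogc < 0 {
		return math.MaxUint64
	}
	return liveHeap + liveHeap*uint64(gogc)/100
}

func main() {
	const MiB = 1 << 20
	for _, gogc := range []int{50, 100, 200} {
		trigger := nextGCTrigger(64*MiB, gogc)
		fmt.Printf("GOGC=%d: live heap 64 MiB, next GC at %d MiB\n", gogc, trigger/MiB)
	}
}
```

With the default of 100, a 64 MiB live heap triggers the next collection at 128 MiB, which is the "doubling" the quote refers to. The same knob can be changed from inside a program with `debug.SetGCPercent` from `runtime/debug`.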

> Go 1.5 introduces a concurrent collector.
This has many advantages over STW collection, but it makes heap growth harder to control because the application can allocate memory while the garbage collector is running.

<sup>(Photo from GopherCon 2015 presentation "Go GC: Solving the Latency Problem in Go 1.5")</sup>

> To achieve the same heap growth limit the runtime must start garbage collection earlier, but how much earlier depends on many variables, many of which cannot be predicted.

> - Start the collector too early, and the application will perform too many garbage collections, wasting CPU resources.
> - Start the collector too late, and the application will exceed the desired maximum heap growth.

> Achieving the right balance without sacrificing concurrency requires carefully pacing the garbage collector.

> GC pacing aims to optimize along two dimensions: heap growth, and CPU utilized by the garbage collector.

> The design of GC pacing consists of four components:

> 1. an estimator for the amount of scanning work a GC cycle will require,
> 2. a mechanism for mutators to perform the estimated amount of scanning work by the time heap allocation reaches the heap goal,
> 3. a scheduler for background scanning when mutator assists underutilize the CPU budget, and
> 4. a proportional controller for the GC trigger.
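
To give a feel for what a "proportional controller for the GC trigger" means, here is a deliberately simplified sketch. It is not the formula from the proposal or the runtime: the function `nextTrigger`, the gain value, and the assumption that the heap grows by a fixed amount while marking runs are all made up for illustration. All quantities are heap growth ratios relative to the live heap (1.0 means "doubled", as with GOGC=100).

```go
package main

import "fmt"

// nextTrigger moves the trigger point by a fraction (gain) of the error
// between the heap growth we wanted (goal) and the growth we actually
// saw when the cycle finished (actual). Overshooting the goal moves the
// trigger earlier; finishing early moves it later.
func nextTrigger(trigger, goal, actual, gain float64) float64 {
	next := trigger + gain*(goal-actual)
	if next < 0 {
		next = 0
	}
	if next > goal {
		next = goal // never start later than the goal itself
	}
	return next
}

func main() {
	const (
		goal           = 1.0 // want the cycle to end at 100% growth
		gain           = 0.5 // hypothetical controller gain
		growthDuringGC = 0.4 // assumed allocation while marking runs
	)
	trigger := 0.9
	for cycle := 1; cycle <= 6; cycle++ {
		actual := trigger + growthDuringGC
		fmt.Printf("cycle %d: trigger=%.3f, heap growth at end=%.3f\n", cycle, trigger, actual)
		trigger = nextTrigger(trigger, goal, actual, gain)
	}
}
```

In this toy model the trigger settles towards the point where marking finishes just as the heap reaches the goal, which is the balance between "too early" and "too late" that the quote describes.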

> The design balances two different views of time: CPU time and heap time.

> - CPU time is like standard wall clock time, but passes GOMAXPROCS times faster. That is, if GOMAXPROCS is 8, then eight CPU seconds pass every wall second and GC gets two seconds of CPU time every wall second. The CPU scheduler manages CPU time.
> - The passage of heap time is measured in bytes and moves forward as mutators allocate.

> The relationship between heap time and wall time depends on the allocation rate and can change constantly.
Mutator assists manage the passage of heap time, ensuring the estimated scan work has been completed by the time the heap reaches the goal size.
Finally, the trigger controller creates a feedback loop that ties these two views of time together, optimizing for both heap time and CPU time goals.
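
The two views of time can be turned into a small worked calculation. This is only a sketch: the helper names are invented, and the 25% GC share is inferred from the "two seconds out of eight" example in the quote rather than taken from the runtime.

```go
package main

import (
	"fmt"
	"math"
)

// gcCPUBudget returns how many CPU seconds the collector gets during
// wallSeconds of wall clock time, given GOMAXPROCS and the fraction of
// total CPU time reserved for the GC.
func gcCPUBudget(gomaxprocs int, wallSeconds, gcFraction float64) float64 {
	cpuSeconds := float64(gomaxprocs) * wallSeconds // CPU time passes GOMAXPROCS times faster
	return cpuSeconds * gcFraction
}

// wallSecondsToGoal converts "heap time" (bytes still to be allocated
// before the heap goal is reached) into wall time for a given allocation
// rate. With no allocation, heap time does not advance at all.
func wallSecondsToGoal(heapNow, heapGoal, allocBytesPerSec float64) float64 {
	if allocBytesPerSec <= 0 {
		return math.Inf(1)
	}
	return (heapGoal - heapNow) / allocBytesPerSec
}

func main() {
	const MiB = 1 << 20
	fmt.Printf("GC CPU budget per wall second: %.1f CPU seconds\n",
		gcCPUBudget(8, 1, 0.25))
	fmt.Printf("wall time until heap goal: %.1f s\n",
		wallSecondsToGoal(100*MiB, 200*MiB, 50*MiB))
}
```

The second function shows why the relationship "can change constantly": the same distance in bytes takes half as long if the program starts allocating twice as fast, so the scan work has to be finished sooner in wall clock terms.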

Answer 3

Score: 21

This is the implementation of the GC:

https://github.com/golang/go/blob/master/src/runtime/mgc.go

From the docs in the source:

> The GC runs concurrently with mutator threads, is type accurate (aka precise), allows multiple GC thread to run in parallel. It is a concurrent mark and sweep that uses a write barrier. It is non-generational and non-compacting. Allocation is done using size segregated per P allocation areas to minimize fragmentation while eliminating locks in the common case.
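
If you just want to watch the collector from a program rather than read `mgc.go`, a small sketch using `runtime.ReadMemStats` and `runtime.GC` looks like this. The helper names `gcCycles` and `forceGC` are mine; the numbers printed will differ from run to run.

```go
package main

import (
	"fmt"
	"runtime"
)

// gcCycles returns the number of GC cycles completed so far.
func gcCycles() uint32 {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.NumGC
}

// forceGC runs a collection and reports how many cycles completed
// between the two measurements.
func forceGC() uint32 {
	before := gcCycles()
	runtime.GC() // blocks until the collection has finished
	return gcCycles() - before
}

func main() {
	// Allocate some short-lived garbage so the collector has work to do.
	for i := 0; i < 1000; i++ {
		_ = make([]byte, 64<<10)
	}

	fmt.Println("cycles completed by forced GC:", forceGC())

	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	fmt.Println("heap in use (bytes):", m.HeapAlloc)
	fmt.Println("total GC cycles:", m.NumGC)
	fmt.Println("total pause (ns):", m.PauseTotalNs)
}
```

Running the program with the environment variable `GODEBUG=gctrace=1` makes the runtime print one line per collection, which is another way to see pause times in practice.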

Answer 4

Score: 10

The Go 1.8 GC might evolve again, with the proposal "Eliminate STW stack re-scanning":

> As of Go 1.7, the one remaining source of unbounded and potentially non-trivial stop-the-world (STW) time is stack re-scanning.
>
> We propose to eliminate the need for stack re-scanning by switching to a hybrid write barrier that combines a Yuasa-style deletion write barrier [Yuasa '90] and a Dijkstra-style insertion write barrier [Dijkstra '78].
>
> Preliminary experiments show that this can reduce worst-case STW time to under 50µs, and this approach may make it practical to eliminate STW mark termination altogether.
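
As a rough illustration of what combining the two barriers means, here is a toy sketch. It follows my reading of the proposal's idea (shade the pointer being overwritten, and shade the new pointer while the writing goroutine's stack has not been scanned yet); the `node` type, the `shade` helper, and the `stackIsGrey` flag are assumptions for this example, and the real barrier is emitted by the compiler and runtime, not written like this.

```go
package main

import "fmt"

// node is a hypothetical heap object with a single pointer slot.
type node struct {
	name   string
	marked bool // true once the object has been shaded (grey or black)
	field  *node
}

// shade marks an object as reachable. A real collector would also queue
// it so that its own pointers get scanned.
func shade(n *node) {
	if n != nil {
		n.marked = true
	}
}

// writePointer stores ptr into *slot while marking is in progress.
func writePointer(slot **node, ptr *node, stackIsGrey bool) {
	shade(*slot) // Yuasa-style deletion barrier: keep the old target alive
	if stackIsGrey {
		shade(ptr) // Dijkstra-style insertion barrier: keep the new target alive
	}
	*slot = ptr
}

func main() {
	oldTarget := &node{name: "old"}
	newTarget := &node{name: "new"}
	holder := &node{name: "holder", field: oldTarget}

	writePointer(&holder.field, newTarget, true)

	fmt.Println("old shaded:", oldTarget.marked)
	fmt.Println("new shaded:", newTarget.marked)
	fmt.Println("holder now points to:", holder.field.name)
}
```

The point of shading on both sides is that a pointer can no longer be hidden from the collector by moving it between the heap and a stack during marking, which is why the stacks would not need to be re-scanned at the end.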

The announcement is here, and the relevant source commits are d70b0fe and earlier.

Answer 5

Score: 3

I'm not sure, but I think the current (tip) GC is already a parallel one, or at least that is a work in progress. If so, the stop-the-world property no longer applies, or will not in the near future. Perhaps someone else can clarify this in more detail.

huangapple
  • Posted on October 19, 2011, 23:24:24
  • When reposting, please keep the link to this article: https://go.coder-hub.com/7823725.html