What is the purpose of Survivor Space in Java memory?
Question
I tried looking this up, but all the questions and answers I came across discuss the purpose of having two survivor spaces. I would like to understand the purpose of having a survivor space at all. How does moving objects from Eden to Survivor help?
Answer 1
Score: 10
Performance.
In general, splitting the heap (generationally or by any other discriminator) has been seen as a rather good thing, though not all collectors do it (Shenandoah, for example, is not such a collector).
Why is that a good thing? It takes time to scan the entire heap for live objects. How do you tell your garbage collector "time to run now"? When is that time? You could say: run after every 100th allocated object. Is that too soon (what if those objects take up only a tiny fraction of the heap)? Or worse: is it too late? What if you say: trigger a collection at 65% heap occupancy? (G1 works along these lines: among other triggers, it starts a concurrent marking cycle once occupancy crosses a threshold, -XX:InitiatingHeapOccupancyPercent, whose initial default is 45%.) What if at that 65% you find out that the majority of objects should have been collected a lot earlier and have been sitting in the heap for far too long?
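To make the "trigger at N% occupancy" idea concrete, here is a minimal sketch. The class and method names (`OccupancyTrigger`, `shouldCollect`) are made up for illustration and are not how any real collector decides; only the `java.lang.management` calls are standard JDK API.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

class OccupancyTrigger {
    // Hypothetical policy: collect once used space reaches thresholdPercent of capacity.
    static boolean shouldCollect(long usedBytes, long capacityBytes, int thresholdPercent) {
        return usedBytes * 100 >= capacityBytes * (long) thresholdPercent;
    }

    // Occupancy of the running JVM's heap, relative to the currently committed size.
    static double currentOccupancyPercent() {
        MemoryUsage usage = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return 100.0 * usage.getUsed() / usage.getCommitted();
    }

    public static void main(String[] args) {
        System.out.printf("heap occupancy: %.1f%%%n", currentOccupancyPercent());
        System.out.println("collect at 65%? " + shouldCollect(650, 1000, 65));
    }
}
```

A fixed percentage like this says nothing about how long the objects have been sitting there, which is exactly the weakness described above.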
You can see that this gets complicated fast. Things get worse when you realize that scanning the heap takes time, and the last thing you want is for your application to stall while GC is running. But please also bear in mind that there are collectors that scan the heap concurrently, so they don't have this problem (Shenandoah, ZGC or C4).
If you could separate the heap, you could scan only a portion of it, which takes little time. People call these "minor" collections. Some collectors therefore divide the heap into "young" and "old"; this separation rests on the premise of "infant mortality": young objects die fast. So if you have this separation and young objects do die soon, you can scan only a certain portion of the heap and, in the majority of cases, deal only with that. This also simplifies the answer to the question of when a GC is supposed to run: when young is full, of course.
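If you want to see how the collector you are running actually divides the heap, the standard `java.lang.management` API lists the heap memory pools. This is only a small sketch (the class name `HeapPools` is made up), and the pool names it prints depend on the JVM and the selected collector: a generational one typically reports separate Eden, Survivor and Old pools, while a non-generational one may report a single pool.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

class HeapPools {
    // Names of all memory pools that belong to the Java heap.
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : heapPoolNames()) {
            System.out.println(name);
        }
    }
}
```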
And now to your direct point: why is a Survivor needed at all? Let's assume it isn't there. The first GC cycle happens (the young region is full; let's call it Eden, to be exact). What happens next? GC needs to tell what is alive there, move it to the "old generation", clear Eden and start allocating again. The second cycle comes in and does the same thing, and so on, until GC says: "old generation is full, I can't move anything anymore". This is where the infamous "old generation" collection happens. It's usually costly.
But we do know about "infant mortality" here. We know that the second and third GC cycles moved some objects to the old generation that would have been collected by the fourth. That opportunity was missed. Hence: Survivor space. It keeps objects around for "a little longer" than a single GC cycle (tracked as the survivor age), on the expectation that they will become garbage in the near future. Thus there is no need to scan the old generation often; you only scan and take care of a smaller portion of the heap (Eden and Survivor). As to why there are two Survivor spaces, that's a separate question...
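The effect of keeping objects around "a little longer" can be shown with a toy simulation. Everything here is an assumption made for illustration: the class `SurvivorSim`, the batch of 100 objects per cycle, the lifetime mix (90 die before the first collection, 9 survive two collections, 1 lives forever) and the rule "promote once age exceeds the threshold". It is not how HotSpot is implemented; HotSpot's comparable knob is -XX:MaxTenuringThreshold, and a threshold of 0 in this sketch stands for "no survivor space".

```java
import java.util.ArrayList;
import java.util.List;

class SurvivorSim {
    // A simulated object that is garbage at every collection numbered >= deathCycle.
    static final class Obj {
        final long deathCycle;
        int age = 0; // number of minor collections survived so far
        Obj(long deathCycle) { this.deathCycle = deathCycle; }
    }

    // Runs `cycles` minor collections and returns how many objects reached the old generation.
    static int promoted(int cycles, int tenuringThreshold) {
        List<Obj> survivor = new ArrayList<>();
        int promoted = 0;
        for (int cycle = 0; cycle < cycles; cycle++) {
            // Eden fills up with a fresh batch of 100 objects.
            List<Obj> eden = new ArrayList<>();
            for (int i = 0; i < 100; i++) {
                long death = (i < 90) ? cycle           // dies before the first collection
                           : (i < 99) ? cycle + 2L      // survives exactly two collections
                           : Long.MAX_VALUE;            // effectively immortal
                eden.add(new Obj(death));
            }
            // Minor collection: only Eden and Survivor are examined, never the old generation.
            List<Obj> nextSurvivor = new ArrayList<>();
            for (List<Obj> space : List.of(eden, survivor)) {
                for (Obj o : space) {
                    if (cycle >= o.deathCycle) continue; // garbage, simply dropped
                    o.age++;
                    if (o.age > tenuringThreshold) {
                        promoted++;                      // moved to the old generation
                    } else {
                        nextSurvivor.add(o);             // kept in Survivor a bit longer
                    }
                }
            }
            survivor = nextSurvivor;
        }
        return promoted;
    }

    public static void main(String[] args) {
        System.out.println("no survivor (threshold 0): " + promoted(10, 0));
        System.out.println("threshold 1:               " + promoted(10, 1));
        System.out.println("threshold 2:               " + promoted(10, 2));
    }
}
```

Under these assumptions, ten cycles promote 100 objects with no survivor space, 90 with a threshold of 1, and only 8 with a threshold of 2. In the last case everything that reaches the old generation is a long-lived object, so the old generation fills up far more slowly.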
In reality, the latest GCs don't need that. They found a way to scan the heap concurrently, while your application is running, so they don't have these spaces. The premise that young objects die early still holds, and some GC algorithms might use it, now or in the future.