If memory fragmentation is no longer an issue with a 64-bit virtual address space, why do garbage collectors in some languages need to compact?

Question

From what I got here:

Memory fragmentation no longer seems to be an issue in a 64-bit virtual address space, so why do the garbage collectors in some popular runtimes (V8 for JavaScript, the JVM, etc.) still need to compact memory after mark-sweep to prevent heap fragmentation?

Answer 1

Score: 2

(V8 developer here.)

Address space fragmentation is not the same as VM heap fragmentation.

On 32-bit systems, allocating, say, a 256MB object can be surprisingly likely to fail even if 2-3 GB of memory are available in total, because existing objects are spread out all across the address space, so that no contiguous 256MB region can be found any more. That's a problem that 64-bit systems (usually) don't have any more.
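
To make the 32-bit case concrete, here is a minimal Python sketch. The layout is invented for illustration (a 3 GB usable address space with one 1 MB allocation every 200 MB), and `largest_free_gap` is a hypothetical helper, not part of any real allocator. It only shows that plenty of total free space can coexist with no sufficiently large contiguous region.

```python
MB = 1024 * 1024
GB = 1024 * MB

def largest_free_gap(space_size, allocations):
    """Return (total_free, largest_contiguous_free) in bytes.

    `allocations` is a list of non-overlapping (start, size) pairs.
    """
    total_free = space_size
    largest = 0
    cursor = 0  # end of the previous allocation
    for start, size in sorted(allocations):
        largest = max(largest, start - cursor)  # gap before this allocation
        total_free -= size
        cursor = start + size
    largest = max(largest, space_size - cursor)  # gap after the last allocation
    return total_free, largest

# Assumed layout: small allocations scattered evenly across the address space.
space = 3 * GB
allocs = [(offset, 1 * MB) for offset in range(0, space, 200 * MB)]

total_free, largest = largest_free_gap(space, allocs)
request = 256 * MB

print(total_free // MB, "MB free in total")          # 3056 MB free in total
print(largest // MB, "MB largest contiguous gap")    # 199 MB largest contiguous gap
print("256 MB request fits:", largest >= request)    # 256 MB request fits: False
```

Only 16 MB out of 3 GB is actually in use here, yet the 256MB request cannot be satisfied because no single gap is large enough.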

VMs with garbage collectors usually organize their managed heap in "pages". As a mental model, you may assume that each page is 1MB in size.

The VM can add pages to its heap (when it needs more) and give them back to the operating system (when they're empty). Now, it can happen that a page was heavily used, then most objects on it died, and now the entire 1MB page is only used for a single object that's just a few bytes in size. When an application goes through a phase where it needs lots of memory (and hence many heap pages), and then that operation completes and most objects become unreachable, it can happen that most pages on the heap are mostly empty, each holding only a few small objects. That's a particular form of wasting memory: the VM needs to hold on to many heap pages, but the total size of all live objects is much smaller than the total size of all heap pages (which in turn is [part of] the amount of memory that the process is using from the operating system's point of view).

That's heap fragmentation. Whether you're running on a 32-bit or 64-bit system has nothing to do with it. The way to avoid it is to have a "compacting" garbage collector, i.e. one that moves live objects together so that some pages become entirely free and can be given back to the operating system.
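
The effect can be sketched with a toy model in Python. Everything here is an assumption made for illustration: pages are plain lists of object sizes, `heap_stats` and `compact` are hypothetical helpers, and the first-fit packing is not how V8 or the JVM actually compact. A real moving collector also has to update every pointer to each object it moves, which this model ignores.

```python
PAGE_SIZE = 1024 * 1024  # 1 MB, the mental-model page size used above

def heap_stats(pages):
    """Return (held_bytes, live_bytes) for a heap given as a list of pages.

    Each page is a list of live object sizes in bytes.
    """
    held = len(pages) * PAGE_SIZE
    live = sum(sum(page) for page in pages)
    return held, live

def compact(pages):
    """Pack all live objects into as few pages as possible (first-fit).

    Pages that end up empty are simply dropped, which stands in for
    returning them to the operating system.
    """
    compacted = []
    for size in (obj for page in pages for obj in page):
        for page in compacted:
            if sum(page) + size <= PAGE_SIZE:
                page.append(size)
                break
        else:
            compacted.append([size])  # no existing page had room
    return compacted

# Assumed situation after a memory-hungry phase: 100 pages,
# each kept alive by a single 64-byte survivor.
heap = [[64] for _ in range(100)]
held_before, live = heap_stats(heap)

heap = compact(heap)
held_after, live_after = heap_stats(heap)

print(held_before // PAGE_SIZE, "pages held for", live, "live bytes")        # 100 pages held for 6400 live bytes
print(held_after // PAGE_SIZE, "page held for", live_after, "live bytes")    # 1 page held for 6400 live bytes
```

Nothing in this model depends on the width of the address space: the waste comes from page granularity, and only moving objects makes whole pages free again.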


Side note: 48 bits of address space (actually 47 bits, one bit is for the kernel) is not as impossible to exhaust as it seems at first. When applications (like virtual machines) have ideas like "oh, we have near-infinite address space, so let's reserve a 4GB 'cage' of address space around this thing, which would allow us to play some interesting performance tricks / create some interesting security guarantees / etc", and then some use case wants thousands of whatever that thing is, then you can run into address space limits before you expect it.
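
As a back-of-the-envelope check of the side note, the arithmetic can be written out in Python. The 47-bit user address space and the 4GB cage size come from the text above; the constant names are made up here, and the result is only an upper bound, since a real process also needs address space for code, stacks, libraries and other mappings.

```python
USER_ADDRESS_BITS = 47       # 48 bits, minus one bit for the kernel
CAGE_SIZE = 4 * 1024**3      # one 4 GB reservation

user_space = 2 ** USER_ADDRESS_BITS      # bytes of user address space
max_cages = user_space // CAGE_SIZE      # upper bound on simultaneous cages

print(user_space // 1024**4, "TiB of user address space")  # 128 TiB of user address space
print(max_cages, "cages at most")                          # 32768 cages at most
```

So a workload that wants tens of thousands of such cages is already at the limit, well before physical memory becomes the constraint.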

Posted on August 9, 2023 at 18:33:11
Link to this post: https://go.coder-hub.com/76866904.html