Shenandoah Garbage Collector Load Reference Barriers

Question

It is no secret to anyone who has followed Shenandoah's development that a major criticism was its use of GC barriers on every single read and write, whether of a reference or of a primitive.

Shenandoah 2.0 claims that this is no longer a problem and that it was solved with so-called load reference barriers. How exactly does that work?

Answer 1

Score: 5

I will assume that the reader knows what a barrier is and why it is needed. For a very short intro, here is another answer of mine on the topic.

To understand this properly, we first need to look at where the original problem really was. Let's take a rather simple example:

static class User {
     private int zip;
     private int age;
}

static class Holder {
    
     private User user;
     // other fields we don't care about
      
}

And now let's imagine a theoretical method like this:

public void access(Holder holder){
     User user = holder.user;
     for(int i = 0; ; i++){ // some loop here
         int zip = user.zip;
         System.out.println(zip);

         user.age = i; // some value taken from the loop, for example
     }
}

The idea is not to show a correct example, but one that performs:

  • a read (user.zip)

  • a write (user.age = ...)

Now, because Shenandoah 1.0 needed to introduce barriers everywhere, this code would look like this:

public void access(Holder holder){
  User user = RB(holder).user;
  for(int i = 0; ; i++){ // some loop here
      int zip = RB(user).zip;
      System.out.println(zip);

      WB(user).age = i; // some value taken from the loop, for example
  }
}

Notice the RB(holder).user (RB stands for read barrier) and WB(user).age (WB stands for write barrier). Now imagine that the loop is hot: you pay the price of those barriers on every iteration. Even if there is no GC activity while the loop runs, the barriers are still in place, and there has to be code that conditionally checks whether the barrier needs to do any work.
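
To make that cost visible, here is a minimal sketch in plain Java. It is a toy model, not HotSpot code: real barriers are machine code emitted by the JIT compiler, and the names forwardee, evacuating and barrierHits are assumptions made up for this illustration. RB and WB are modelled as ordinary static methods that count how often they run.

```java
// Toy model only: real barriers are machine code emitted by the JIT compiler,
// not Java methods. Every name below is hypothetical.
class BarrierCostSketch {

    static class User {
        int zip;
        int age;
        User forwardee = this; // stand-in for a forwarding pointer
    }

    static boolean evacuating = false; // stand-in for "the GC is currently evacuating"
    static long barrierHits = 0;       // how many times a barrier ran

    // Read barrier: always follow the forwarding pointer to the current copy.
    static User RB(User u) {
        barrierHits++;
        return u.forwardee;
    }

    // Write barrier: check the GC state first, copy the object only if needed.
    static User WB(User u) {
        barrierHits++;
        if (evacuating && u.forwardee == u) {
            User copy = new User();
            copy.zip = u.zip;
            copy.age = u.age;
            u.forwardee = copy;
        }
        return u.forwardee;
    }

    static long hotLoop(User user, int iterations) {
        barrierHits = 0;
        for (int i = 0; i < iterations; i++) {
            int zip = RB(user).zip; // barrier on every read
            WB(user).age = zip + i; // barrier on every write
        }
        return barrierHits;
    }

    public static void main(String[] args) {
        // No GC activity at all, yet both barriers run on every iteration.
        System.out.println(hotLoop(new User(), 1_000)); // prints 2000
    }
}
```

With evacuating left at false the barriers never copy anything, but the loop still goes through them 2,000 times for 1,000 iterations.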

Long story short: those barriers are not free, by any means.

These barriers are needed to maintain heap consistency. Because there are two copies of an object in memory during the evacuation phase, you need to always read and write consistently. In Shenandoah 1.0, consistently meant that a read could happen from either the "to-space" or the "from-space" (this is called the "weak to-space invariant"), while a write could go into the to-space only.
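
Why writes in particular must land in the to-space copy can be shown with another small sketch. Again this is a toy model under assumptions: evacuate and the forwardee field are hypothetical stand-ins for what the collector does, not real APIs.

```java
// Toy model only: shows how an update is lost when it is written
// to the old copy of an object that has already been evacuated.
class TwoCopiesSketch {

    static class User {
        int age;
        User forwardee = this; // hypothetical forwarding pointer
    }

    // Hypothetical evacuation step: copy the object and install a forwarding pointer.
    static User evacuate(User from) {
        User to = new User();
        to.age = from.age;
        from.forwardee = to;
        return to;
    }

    // Writing through a stale from-space reference, with no barrier.
    static int writeWithoutBarrier() {
        User from = new User();
        User to = evacuate(from);
        from.age = 42; // lands in the old copy
        return to.age; // the surviving copy never sees the update
    }

    // Writing through the forwarding pointer, as a write barrier would.
    static int writeWithBarrier() {
        User from = new User();
        User to = evacuate(from);
        from.forwardee.age = 42; // redirected into the to-space copy
        return to.age;
    }

    public static void main(String[] args) {
        System.out.println(writeWithoutBarrier()); // prints 0
        System.out.println(writeWithBarrier());    // prints 42
    }
}
```

Once evacuation finishes only the to-space copy survives, so the write that went to the from-space copy is simply gone.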


Shenandoah 2.0 says that it will ensure a so-called "to-space invariant" (the strong one, as opposed to the previous weak one). Basically, it says that all reads and writes are going to happen from/into the "to-space". During evacuation there are two copies of the object: one in the old region (called "from-space") and one in the new region (called "to-space").

It achieves this "to-space" invariant with a rather simple, yet brilliant idea. Instead of employing barriers where the object is read or written, it ensures that the object that was initially loaded was for sure loaded from the "to-space". This is done via load-reference-barriers. It is far easier to understand by rewriting the previous example:

  public void access(Holder holder){
      User user = LRB(holder.user); // barrier on the reference that was just loaded
      for(int i = 0; ; i++){ // some loop here
          int zip = user.zip;
          System.out.println(zip);

          user.age = i; // some value taken from the loop, for example
      }
  }

We have introduced the LRB barrier and removed the two others. So load-reference-barriers run when a reference to an object is loaded (the Shenandoah developers call this the definition site), instead of when the object is read from or stored to (which they call the use site). You can think of these barriers as being inserted where a reference is loaded from the heap, for example at getfield (for reference fields) and aaload.
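
The same toy model can be adapted to the load-reference-barrier idea. This is a sketch under assumptions, not the real implementation: evacuating, forwardee and barrierHits are hypothetical names, and every object is treated as if it needed evacuation while the flag is set. The barrier is applied once, to the reference just loaded from holder.user, and the loop body then runs without barriers.

```java
// Toy model only: a load-reference barrier modelled as a plain Java method.
// Every name below is hypothetical.
class LoadReferenceBarrierSketch {

    static class User {
        int zip;
        int age;
        User forwardee = this; // stand-in for a forwarding pointer
    }

    static class Holder {
        User user;
    }

    static boolean evacuating = false; // stand-in for "the GC is currently evacuating"
    static long barrierHits = 0;       // how many times the barrier ran

    // Applied to a reference right after it has been loaded from the heap.
    static User LRB(User loaded) {
        barrierHits++;
        if (!evacuating || loaded == null) {
            return loaded; // fast path: nothing to do
        }
        if (loaded.forwardee == loaded) { // not copied yet, so copy it now
            User copy = new User();
            copy.zip = loaded.zip;
            copy.age = loaded.age;
            loaded.forwardee = copy;
        }
        return loaded.forwardee; // always the to-space copy
    }

    static long access(Holder holder, int iterations) {
        barrierHits = 0;
        User user = LRB(holder.user); // the only barrier in this method
        for (int i = 0; i < iterations; i++) {
            int zip = user.zip; // plain read
            user.age = zip + i; // plain write
        }
        return barrierHits;
    }

    public static void main(String[] args) {
        Holder holder = new Holder();
        holder.user = new User();
        evacuating = true;
        System.out.println(access(holder, 1_000));     // prints 1
        System.out.println(holder.user.forwardee.age); // prints 999
    }
}
```

Compared with the 2,000 barrier executions of the Shenandoah 1.0 style loop, the barrier here runs once, before the loop, and all reads and writes in the loop go straight to the to-space copy.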

huangapple
  • Posted on September 20, 2020, 11:04:16
  • When reposting, please keep the link to this article: https://go.coder-hub.com/63975139.html