How do lambdas store references to variables

Question

I feel that this is an easy question, but I was unable to find an answer. After executing this simple code:

public class MyClass {

    int value;

    // A functional interface: exactly one abstract method, so a lambda can implement it.
    interface MyFunctionalInterface {
        void foo();
    }

    public static void main(String[] args) {
        MyClass xxx = new MyClass();
        // The lambda captures the local variable xxx (a reference to the object).
        MyFunctionalInterface myLambda = () -> System.out.println("Value: " + xxx.value);

        xxx.value = 777;
        myLambda.foo();

        xxx.value = 888;
        bar(myLambda);
    }

    static void bar(MyFunctionalInterface myLambda) {
        myLambda.foo();
    }

}

The result (as expected) is just:

Value: 777
Value: 888

My question is about the xxx object. How does the lambda store a reference to this object? Is some anonymous class created that stores the "Value: " string and the xxx object? The object is still reachable inside bar, so the reference has to be stored somewhere.

Edit:
Based on a hint from @Sweeper, it can be concluded that lambdas store references to such variables inside themselves.
If (with the code above) we print:

System.out.println(Arrays.toString(myLambda.getClass().getDeclaredFields()));

we will get:

[private final MyClass MyClass$$Lambda$1/0x0000000800000c08.arg$1]

However, if the code is modified so that the lambda is created without xxx:

MyFunctionalInterface myLambda = () -> System.out.println("Just string");

then the declared fields will be empty and it will print just "[]".

Answer 1

Score: 3

First, let me double-check that you understand how Java references and objects work; otherwise the explanation isn't going to make any sense:

> MyClass xxx = new MyClass();

This [A] reserves space on the heap large enough to store one instance of MyClass, [B] constructs a new instance of MyClass in that space, and then [C] assigns to xxx a pointer to the location where the object now resides on the heap. Because 'pointer', as used in C, kind of implies you can e.g. increment it or print it, Java doesn't call it a 'pointer'; it calls it a 'reference'. But it's a pointer.

> xxx.value = 777;

This doesn't 'change' xxx at all. xxx. means 'take the variable xxx, follow the pointer, then go change stuff over on the heap'. Which, crucially, means xxx itself does not change. Only the object xxx points at is changed here.
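The distinction between changing the variable and changing the object it points at can be sketched as follows. The class ReferenceDemo and its nested Box are hypothetical names used only for illustration; Box stands in for MyClass.

```java
// Hypothetical demo; Box is an illustrative stand-in for MyClass.
class ReferenceDemo {
    static class Box {
        int value;
    }

    // Follows the reference and changes the object on the heap.
    static void mutate(Box box) {
        box.value = 777;
    }

    // Reassigns only this method's own copy of the reference;
    // the caller's variable still points at the original object.
    static void reassign(Box box) {
        box = new Box();
        box.value = 999;
    }

    public static void main(String[] args) {
        Box xxx = new Box();
        Box alias = xxx;                    // copies the reference, not the object

        mutate(xxx);
        System.out.println(alias.value);    // 777: both variables see the same object

        reassign(xxx);
        System.out.println(xxx.value);      // 777: xxx itself was never changed
        System.out.println(alias == xxx);   // true
    }
}
```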

Thus, xxx is effectively final throughout this method. It's a constant. Which means the only thing we need to do is ensure the lambda continues to know about it. Given that a lambda can 'outlive' the method it was created in, this is a bit of an issue: ordinarily, local variables (and method parameters are a kind of local variable) live on the stack and therefore poof out of existence when the method ends. Clearly, then, xxx cannot live there. At least, not the one the lambda uses.

Java solves this problem differently from most languages. Most languages realize xxx has to outlive the method and 'hoist it onto the heap'. Java doesn't do this (after all, things that live on the heap, or can escape via lambdas to other threads, all of a sudden need to answer hairy questions such as "does that mean local variables can now be marked volatile?"). Instead, Java just clones the variable. Note that what gets cloned is the variable's value, i.e. the reference, not the object it points at, which is why the lambda in the question still sees 777 and then 888. Because clones would get really confusing if code also modified the variable (which of the clones are you modifying?), your code won't even compile unless xxx is final! The one 'gift' Java gives you is that if your variable could have been marked final without causing issues, i.e. you only write to it once, then you can omit final and Java will just assume it. Try it! Take that code and reassign xxx anywhere (in the lambda or outside of it; it does not matter), and you'll get an error saying you can't use xxx in the lambda at all because it is not 'effectively final' (the Java spec's term for "either marked final, or could have been so marked without issue").
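A minimal sketch of the capture rule, assuming a hypothetical class EffectivelyFinalDemo with a nested Box in place of MyClass. The lambda is returned from capture, so it outlives the method whose parameter it captured; java.util.function.IntSupplier is used only so the result can be inspected.

```java
import java.util.function.IntSupplier;

// Hypothetical demo; Box is an illustrative stand-in for MyClass.
class EffectivelyFinalDemo {
    static class Box {
        int value;
    }

    static IntSupplier capture(Box box) {
        // box is never reassigned, so it is effectively final and may be captured.
        // Uncommenting the next line turns the lambda below into a compile error:
        // box = new Box();
        return () -> box.value;   // the lambda keeps its own copy of the reference
    }

    public static void main(String[] args) {
        Box xxx = new Box();
        IntSupplier reader = capture(xxx);   // capture() has returned; its stack frame is gone

        xxx.value = 777;
        System.out.println(reader.getAsInt());   // 777

        xxx.value = 888;
        System.out.println(reader.getAsInt());   // 888: same object, reached via the copied reference
    }
}
```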

So now it's simple: xxx in the context of the method itself remains a stack-allocated variable as normal. And the lambda gets its own copy of it, which, yes, lives on the heap: there is an object representing the lambda that contains its captured state. Javac knows that only the variables actually used by the lambda need capturing (which may include this!), and captures only what is needed.
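Conceptually, the object representing the lambda behaves like a small class with one final field per captured variable. The sketch below is a hand-written approximation, not what the JVM actually generates: CapturedStateSketch, ValueReader and HandWrittenLambda are hypothetical names, and the interface returns an int (instead of printing) only so the two variants can be compared.

```java
// Hypothetical sketch; the class the runtime really generates differs in detail.
class CapturedStateSketch {
    static class MyClass {
        int value;
    }

    interface ValueReader {
        int read();
    }

    // Hand-written stand-in for the lambda's object.
    static final class HandWrittenLambda implements ValueReader {
        private final MyClass captured;   // plays the role of the arg$1 field seen in the question

        HandWrittenLambda(MyClass captured) {
            this.captured = captured;     // copy of the reference, taken at creation time
        }

        @Override
        public int read() {
            return captured.value;        // follows the reference on every call
        }
    }

    static ValueReader viaLambda(MyClass xxx) {
        return () -> xxx.value;
    }

    static ValueReader viaClass(MyClass xxx) {
        return new HandWrittenLambda(xxx);
    }

    public static void main(String[] args) {
        MyClass xxx = new MyClass();
        ValueReader a = viaLambda(xxx);
        ValueReader b = viaClass(xxx);

        xxx.value = 777;
        System.out.println(a.read() + " " + b.read());   // 777 777

        xxx.value = 888;
        System.out.println(a.read() + " " + b.read());   // 888 888
    }
}
```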

Lambdas that capture no state are special in that their object has nothing to store. The body ends up, effectively, as a static method, and the runtime is free to create a single stateless instance and hand out that same instance every time the lambda expression is evaluated (current HotSpot JVMs typically do this).
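The difference between a capturing and a non-capturing lambda can be observed with the same reflection call used in the question's edit. This is a sketch only: LambdaFieldsDemo and fieldCount are hypothetical names, and the exact field layout is a JVM implementation detail, so the counts in the comments reflect what the question reports rather than anything the specification guarantees.

```java
// Hypothetical demo; field counts are implementation details of the JVM in use.
class LambdaFieldsDemo {
    static class MyClass {
        int value;
    }

    // Captures the parameter xxx.
    static Runnable capturing(MyClass xxx) {
        return () -> System.out.println("Value: " + xxx.value);
    }

    // Captures nothing.
    static Runnable nonCapturing() {
        return () -> System.out.println("Just string");
    }

    // Number of fields declared by the class that represents the lambda.
    static int fieldCount(Object lambda) {
        return lambda.getClass().getDeclaredFields().length;
    }

    public static void main(String[] args) {
        // The question reports one field (arg$1) for the capturing lambda...
        System.out.println(fieldCount(capturing(new MyClass())));
        // ...and an empty field list for the non-capturing one.
        System.out.println(fieldCount(nonCapturing()));
    }
}
```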

Regardless of whether your lambda has state or not, you should never rely on its object identity. The Java Language Specification leaves it unspecified whether evaluating a lambda expression produces a new object or reuses an existing one, so your code won't crash, but the identity is not guaranteed to be meaningfully unique. In other words: don't use lambdas as keys in maps, don't lock on them (synchronized), and don't compare them with ==.

huangapple
  • Published on March 15, 2023 at 18:48:37
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75743660.html