Why is the JVM not "seeing" a duplicate String value in String Pool memory?

Question

This may be a duplicate, but I couldn't find the explanation I needed. Can someone explain this to me?

If I'm correct:

String s1 = "a";
String s2 = "a";

Both s1 and s2 will point to the same address in String pool and there will be only one object with value "a".

Now, if I do this:

String s1 = "a"; //line A
s1 = s1.concat("b"); //line B; I don't understand what's happening here in terms of references
String s2 = "ab";//line C
System.out.println(s1 == s2); //false

Why do I get false?

My way of seeing this (probably wrong), goes like this:

after line A -> an object (with value a) is created in the String pool, referenced by s1;
after line B -> an object with value b is created in the String pool (with no reference), then a new object with value ab is created in the String pool (referenced by the existing s1);
after line C -> (this is probably where I'm wrong) since the existing object with value ab (referenced by s1) was created using concat() (or +), the JVM will not reuse it for s2; it will instead create a new object in the String pool with value ab, referenced by s2;

Where am I wrong?

Answer 1

Score: 5

TL;DR - your confusion comes from how Java stores Strings in memory, namely the difference between ordinary Heap objects and objects in the String Constant Pool.


Deep Dive into String memory model

Design Motivation

In Java, String is probably the most heavily used object type. Because of this, Java manages String objects with a special memory strategy, holding them either in the general Heap, in a dedicated part of the heap called the String Constant Pool, or in both.

The String Constant Pool is a special area of Heap memory that holds String objects with unique literal values. Any time you create a String from a literal, the JVM first checks whether an object with the same value is already in the pool. If it is, a reference to that object is returned; if it is not, a new object is allocated in the String Constant Pool. The same happens for every other String literal.

The pool is a good idea for the reason its name suggests: it stores constant, immutable String objects. When you create many Strings with the same literal content, only one object per literal value is referenced each time, and no new objects are created for a literal that already exists in the pool.
> Note that this is only possible because String is immutable by definition. Also note that the pool of strings, which is initially empty, is maintained privately by the class String.
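
A minimal sketch of the literal lookup described above. The class and method names are made up for illustration; the only behavior it relies on is that two identical literals resolve to one pooled object, so a reference comparison with == succeeds.

```java
final class LiteralPoolDemo {
    // Returns true when both literals resolve to the same pooled object.
    static boolean literalsShareOneObject() {
        String s1 = "a";
        String s2 = "a";
        return s1 == s2; // compares references, not contents
    }

    public static void main(String[] args) {
        System.out.println(literalsShareOneObject()); // prints true
    }
}
```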

Where does Java place String objects?

Now this is where things get interesting. The important point to bear in mind is that whenever you create a String object with a new String() expression, you force Java to allocate a new object on the Heap, outside the pool; if you create a String object from a "string literal", however, it is allocated in the String Constant Pool. As we've said, the String Constant Pool exists mainly to reduce memory usage and improve the reuse of existing String objects in memory.

So, if you write:

String s1 = "a";
String s2 = "a";
String s3 = new String("a");
  1. A String object is created in the String Constant Pool, and a reference to it is stored in variable s1;
  2. The String Constant Pool is looked up, and because an object with the same literal value ("a") is found in the pool, a reference to that same object is stored in variable s2;
  3. A String object is explicitly created on the Heap, and its reference is stored in variable s3.
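
The three steps can be checked with a small program. This is only a sketch; the class name and the helper method are hypothetical, and the three variables mirror the snippet above.

```java
import java.util.Arrays;

final class NewStringDemo {
    // Returns { s1 == s2, s1 == s3, s1.equals(s3) }.
    static boolean[] compare() {
        String s1 = "a";             // pooled literal
        String s2 = "a";             // same pooled object as s1
        String s3 = new String("a"); // separate object on the Heap
        return new boolean[] { s1 == s2, s1 == s3, s1.equals(s3) };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(compare())); // prints [true, false, true]
    }
}
```

The last element shows why equals() is the right tool for comparing contents: s1 and s3 hold the same characters even though they are different objects.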

Interning Strings

If you want the pooled (canonical) version of a String object created with the new operator, you can invoke its intern() method, and one of two things will happen:

  1. if the pool already contains a string equal to this String object as determined by the equals(Object) method, then the string from the pool will be returned;
  2. otherwise, this String object will be added to the pool and a reference to this String object will be returned.
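
A short sketch of the first case, where an equal string is already in the pool. The names are hypothetical. Note that intern() returns the pooled object; the object it was called on stays a separate Heap object.

```java
final class InternDemo {
    // True when intern() returns the pooled literal and not the Heap copy.
    static boolean internReturnsPooledObject() {
        String literal = "a";              // already in the pool
        String onHeap = new String("a");   // distinct Heap object
        String canonical = onHeap.intern();
        return canonical == literal && canonical != onHeap;
    }

    public static void main(String[] args) {
        System.out.println(internReturnsPooledObject()); // prints true
    }
}
```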

What happens in your code?

String s1 = "a";
String s2 = "a";

> Both s1 and s2 will point to the same address in String pool and there will be only one object with value "a".

True. The first literal creates a String object and places it in the String Constant Pool. After that, because a String with value "a" already exists, no new object is created for s2; the reference stored in s1 is stored in s2 as well.

Now, let's finally have a look at your question:

String s1 = "a"; //allocated in the String Constant Pool
s1 = s1.concat("b"); //concat() returns a new String object, allocated in the Heap
String s2 = "ab"; //"ab" still does NOT exist in the String Constant Pool, so it gets allocated there
System.out.println(s1 == s2); //prints false, because s1 refers to the object in the Heap and s2 refers to a different object in the String Constant Pool

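If, however, you execute

System.out.println(s1.intern() == s2);

this will print true, and I hope that by now you understand why. The pool already contains a string equal to "ab" (the literal referenced by s2), so intern() returns that pooled object. The object created by concat() is not moved; it stays on the Heap, and s1 still refers to it.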
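
Putting the whole scenario together, as a sketch with hypothetical class and method names. The fourth comparison checks whether s1 itself became the pooled object.

```java
import java.util.Arrays;

final class ConcatQuestionDemo {
    // Returns { s1 == s2, s1.equals(s2), s1.intern() == s2, s1 == s1.intern() }.
    static boolean[] run() {
        String s1 = "a";
        s1 = s1.concat("b"); // new String object, not in the pool
        String s2 = "ab";    // pooled literal
        return new boolean[] {
            s1 == s2,
            s1.equals(s2),
            s1.intern() == s2,
            s1 == s1.intern()
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run())); // prints [false, true, true, false]
    }
}
```

The final false shows that calling intern() does not change what s1 refers to; to use the pooled object you have to keep the returned reference, for example with s1 = s1.intern().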

Answer 2

Score: 4

From the API documentation:

> public String concat(String str)
...
If the length of the argument string is 0, then this String object is returned. Otherwise, a new String object is created, representing a character sequence that is the concatenation of the character sequence represented by this String object and the character sequence represented by the argument string.
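
Both branches of that contract can be observed directly. A minimal sketch with hypothetical names:

```java
final class ConcatContractDemo {
    // Empty argument: concat() returns the very same object.
    static boolean emptyArgumentReturnsThis() {
        String s = "a";
        return s.concat("") == s;
    }

    // Non-empty argument: concat() creates a new object, distinct from the pooled literal.
    static boolean nonEmptyArgumentCreatesNewObject() {
        String s = "a";
        String joined = s.concat("b");
        return joined != "ab" && joined.equals("ab");
    }

    public static void main(String[] args) {
        System.out.println(emptyArgumentReturnsThis());         // prints true
        System.out.println(nonEmptyArgumentCreatesNewObject()); // prints true
    }
}
```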

Answer 3

Score: 2

s1.concat("b") returns a new String and therefore s1 == s2 will be false.

It's like the following:

String a = "Hello";
String b = new String("Hello");

System.out.println(a == b); // false

Source code of String#concat in Java 11 just for reference:

public String concat(String str) {
    int olen = str.length();
    if (olen == 0) {
        return this;
    }
    if (coder() == str.coder()) {
        byte[] val = this.value;
        byte[] oval = str.value;
        int len = val.length + oval.length;
        byte[] buf = Arrays.copyOf(val, len);
        System.arraycopy(oval, 0, buf, val.length, oval.length);
        return new String(buf, coder);
    }
    int len = length();
    byte[] buf = StringUTF16.newBytesFor(len + olen);
    getBytes(buf, 0, UTF16);
    str.getBytes(buf, len, UTF16);
    return new String(buf, UTF16);
}

Answer 4

Score: 1

I suppose the String concat() method instantiates a new String("ab") which is a different object than the "ab" object.
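
Since the question also mentions the + operator, here is a sketch of how it compares, assuming standard javac behavior; the class and method names are hypothetical. When both operands are literals, the expression is a compile-time constant and ends up as the pooled "ab". When one operand is an ordinary (non-final) variable, the concatenation happens at run time and yields a new object, just like concat().

```java
final class PlusOperatorDemo {
    // "a" + "b" is a constant expression, so it is the same object as the literal "ab".
    static boolean constantExpressionIsPooled() {
        String folded = "a" + "b";
        return folded == "ab";
    }

    // a is not final, so a + "b" is evaluated at run time and creates a new object.
    static boolean runtimeConcatenationIsNotPooled() {
        String a = "a";
        String joined = a + "b";
        return joined != "ab" && joined.equals("ab");
    }

    public static void main(String[] args) {
        System.out.println(constantExpressionIsPooled());      // prints true
        System.out.println(runtimeConcatenationIsNotPooled()); // prints true
    }
}
```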

huangapple
  • Posted on August 5, 2020, 01:37:32
  • When reposting, please keep the link to this article: https://go.coder-hub.com/63252282.html