How is the Java String constructor implemented?

Question

I was going through the Java String source code and found a constructor about which I have some doubts:

public String(String original) {
    this.value = original.value;
    this.hash = original.hash;
}

I understand that original may be passed in as a String literal (in double quotes), but I can't see how Java/the JVM computes original.value as a char array. What is "value" here? If value is a char array, how is the .value function/field calculated?

Answer 1

Score: 2

Yes, as already mentioned in the comments, this is very simple.
Since you're looking at the String class itself, it has access to its own private fields, including those of another String instance such as original. value is the field where the characters of the given string are actually stored, in a char array. The constructor simply references that field by name; nothing is computed.
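
A minimal sketch of that point, assuming a hypothetical class Box (not the JDK's code): in Java, private access is granted per class rather than per object, so a constructor can read a private field of another instance of the same class directly.

```java
// Hypothetical class mirroring the shape of String(String original).
public final class Box {
    private final char[] value; // private, yet visible to all code inside Box

    public Box(char[] chars) {
        this.value = chars.clone(); // defensive copy of the caller's array
    }

    // Copy constructor: reads original.value directly, no getter involved.
    public Box(Box original) {
        this.value = original.value; // shares the same array reference
    }

    public boolean sharesArrayWith(Box other) {
        return this.value == other.value;
    }

    public static void main(String[] args) {
        Box a = new Box(new char[] {'h', 'i'});
        Box b = new Box(a);
        System.out.println(b.sharesArrayWith(a)); // prints "true"
    }
}
```

Here original.value is a plain field read, just like this.value; there is no method call behind it.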

Answer 2

Score: 1

The docs say

> Initializes a newly created String object so that it represents the
> same sequence of characters as the argument; in other words, the newly
> created string is a copy of the argument string. Unless an explicit
> copy of original is needed, use of this constructor is unnecessary
> since Strings are immutable.

Technically, the new String gets the value and hash of the original (the array reference is shared, not duplicated), which means that it is a copy of another String.
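
As a small illustration of what the docs describe (the class and method names here are made up for the example), the copy is a distinct object with the same characters and the same hash code:

```java
public class CopyDemo {
    // Hypothetical helper wrapping the constructor under discussion.
    static String copyOf(String original) {
        return new String(original);
    }

    public static void main(String[] args) {
        String original = "hello";
        String copy = copyOf(original);

        System.out.println(copy.equals(original));                  // true: same characters
        System.out.println(copy == original);                       // false: a distinct object
        System.out.println(copy.hashCode() == original.hashCode()); // true
    }
}
```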

Answer 3

Score: 1

String by design holds Unicode text, so all language scripts may be combined.
For that, the implementation holds an array (field name value), where every char is a two-byte UTF-16 code unit.
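
A short sketch of what "two-byte UTF-16 value" means in practice (class and method names are made up for the example): a character outside the Basic Multilingual Plane occupies two chars, a surrogate pair, even though it is a single code point.

```java
public class Utf16Demo {
    // Number of char units a single Unicode code point occupies in a String.
    static int charUnits(int codePoint) {
        return new String(Character.toChars(codePoint)).length();
    }

    public static void main(String[] args) {
        String emoji = new String(Character.toChars(0x1F600)); // above U+FFFF

        System.out.println(Character.BYTES);   // 2: one char is two bytes
        System.out.println(charUnits('A'));    // 1
        System.out.println(charUnits(0x1F600)); // 2: a surrogate pair
        System.out.println(emoji.codePointCount(0, emoji.length())); // 1
    }
}
```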

You encountered the one and, AFAIK, only silly point in the Java classes.

The shown copy constructor is pointless, as Strings are immutable objects,
and they may be shared by simple assignment. It is a fossil of Java's C++ heritage,
maybe in association with String interning.

Making a copy is pointless. The same holds for the internal char array, which is indeed just assigned by reference rather than copied. (Not very consistent.)

So the following shows inexperienced Java usage:

String s = new String(t);
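
A minimal sketch of why plain assignment is enough (names are made up for the example): both variables refer to the same object, and because a String can never be modified, "changing" one of them only creates a new String and leaves the other untouched.

```java
public class ShareDemo {
    // Hypothetical helper: returns a new String, never modifies its argument.
    static String withBang(String s) {
        return s.concat("!");
    }

    public static void main(String[] args) {
        String t = "text";
        String s = t;                // no copy: both names refer to one object
        System.out.println(s == t);  // true

        s = withBang(s);             // s now refers to a new String
        System.out.println(t);       // text
        System.out.println(s);       // text!
    }
}
```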

In newer Java versions (since the compact strings introduced in Java 9), the value of a String is actually a byte array, encoded as either Latin-1 or UTF-16, and hence the chars are produced on demand.


About String literals:

String literals are stored in a data structure in the .class file called the constant pool, as (modified) UTF-8 bytes. When the class is loaded, the JVM ensures that each literal becomes a String object.
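
One observable consequence, shown as a small sketch (names are made up for the example): identical literals resolve to the same pooled String object, while new String(...) always allocates a separate one that intern() maps back to the pooled instance.

```java
public class LiteralDemo {
    // Hypothetical helper: returns the literal as seen from another method.
    static String literal() {
        return "hello";
    }

    public static void main(String[] args) {
        String a = "hello";
        String b = literal();            // same literal, same pooled object
        String c = new String("hello");  // explicitly allocated object

        System.out.println(a == b);          // true
        System.out.println(a == c);          // false
        System.out.println(a == c.intern()); // true
    }
}
```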

References to static final String constants of another class are copied (inlined) into the constant pool of the class that uses them, so the original class may no longer appear as a dependency.
Keeping String constants in another class may therefore require a manual clean build when they change, as there might no longer be a class dependency to trigger recompilation.

huangapple
  • Posted on March 16, 2020, 20:31:54
  • When reposting, please keep the link to this article: https://go.coder-hub.com/60706090.html