Implicit left-padding of the binary literal in Java

Question

While constructing a mask to extract the most significant bit of a two's complement number, I ran into unexpected behavior.

To check whether the most significant bit of a signed 8-bit number is set, I can extract the bit as follows.

byte value = -1;
long byteSignMask = 0b1000_0000;
long signBit = value & byteSignMask; // nonzero when the sign bit is set

The result is identical regardless of whether I use 0b1000_0000 or 1L << 7 for byteSignMask. In fact, the following code passes.

long byteSign1 = 1L << 7;
long byteSign2 = 0b1000_0000;
// OK
assertEquals(byteSign1, byteSign2);

But when I did the same for the int type, the outcome was unexpected.

long intSign1 = 1L << 31;
long intSign2 = 0b1000_0000_0000_0000_0000_0000_0000_0000;

// Fail: expected:<2147483648> but was:<-2147483648>
assertEquals(intSign1, intSign2);

Actually, they are different.

// intSign1 = 10000000000000000000000000000000
System.out.println("intSign1 = " + Long.toBinaryString(intSign1));
// intSign2 = 1111111111111111111111111111111110000000000000000000000000000000
System.out.println("intSign2 = " + Long.toBinaryString(intSign2));

It looks like the literal mask for the int (intSign2) is left-padded with 1s, while the shift operation (intSign1) does not cause such an effect.

Why is the integer expressed by the binary literal automatically left-padded with 1s? Is there any official documentation describing this behavior?

Answer 1

Score: 2

The value you assign to intSign2 here:

0b1000_0000_0000_0000_0000_0000_0000_0000

is an int literal, not a long literal.

So you are saying "I want the int value represented by this bit pattern".

A single 1 followed by 31 0s, interpreted as a 32-bit two's complement signed integer (int), is -2147483648. This value is then widened to a long when you assign it to the long variable intSign2. That is where the padded 1s come from.
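The two steps described above (evaluate the literal as an int, then widen it on assignment) can be separated in a small sketch. The class and field names are hypothetical and only serve the illustration:

```java
class IntLiteralWidening {
    // No L suffix: the literal is an int whose only set bit is the sign bit.
    static final int AS_INT = 0b1000_0000_0000_0000_0000_0000_0000_0000;

    // Assigning an int to a long triggers a widening primitive conversion.
    static final long AS_LONG = AS_INT;

    public static void main(String[] args) {
        System.out.println(AS_INT == Integer.MIN_VALUE);           // true
        System.out.println(AS_LONG);                               // -2147483648
        System.out.println(Long.toBinaryString(AS_LONG).length()); // 64
    }
}
```

The value is already negative before any long is involved, so the widening step only has to preserve it.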

To make it a long literal, you would have to add an L suffix:

0b1000_0000_0000_0000_0000_0000_0000_0000L
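As a quick check, the suffixed literal should compare equal to the shifted mask from the question. This is a minimal sketch with hypothetical class and constant names:

```java
class LongLiteralMask {
    // With the L suffix the literal is a long, so bit 31 is an ordinary value bit.
    static final long LITERAL_MASK = 0b1000_0000_0000_0000_0000_0000_0000_0000L;

    // The mask built with a shift, as in the question.
    static final long SHIFTED_MASK = 1L << 31;

    public static void main(String[] args) {
        System.out.println(LITERAL_MASK == SHIFTED_MASK);               // true
        System.out.println(LITERAL_MASK);                               // 2147483648
        System.out.println(Long.toBinaryString(LITERAL_MASK).length()); // 32
    }
}
```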

> Why is byteSign2 padded with left 0s, while intSign2 is padded with left 1s?

When you specify a binary integer literal, and the number of bits you specify is fewer than the bit size of the data type, it will always get left-padded with 0s. So in the case of byteSign2, you said 0b1000_0000, which is actually equivalent to this binary literal:

0b0000_0000_0000_0000_0000_0000_1000_0000

In the case of intSign2, you specified the full 32 bits of int, so no padding is done at all.
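The equivalence of the short and fully written-out literals can be checked directly. The sketch below also shows the 8-bit sign test from the question using that mask; the class, constant, and method names are made up for illustration:

```java
class ShortLiteralPadding {
    // Eight digits given: the remaining 24 high bits of the int are 0.
    static final int SHORT_FORM = 0b1000_0000;

    // The same value with all 32 bits written out.
    static final int FULL_FORM = 0b0000_0000_0000_0000_0000_0000_1000_0000;

    // The byte is promoted to int before the AND; the mask then isolates bit 7.
    static boolean isNegative(byte value) {
        return (value & SHORT_FORM) != 0;
    }

    public static void main(String[] args) {
        System.out.println(SHORT_FORM == FULL_FORM);   // true
        System.out.println(SHORT_FORM);                // 128
        System.out.println(isNegative((byte) -1));     // true
        System.out.println(isNegative((byte) 127));    // false
    }
}
```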

The left-padded 1s are a result of the int-to-long conversion that took place. According to the Java Language Specification (§5.1.2, Widening Primitive Conversion), this conversion works like this:

> A widening conversion of a signed integer value to an integral type T simply sign-extends the two's-complement representation of the integer value to fill the wider format.

Because the conversion sign-extends, it pads with 1s if the sign bit is 1 and with 0s if the sign bit is 0. This preserves the value of the number: negative numbers remain negative and non-negative numbers remain non-negative. For your binary literal, the sign bit is 1, so it pads with 1s.
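To contrast sign extension with zero extension, here is a small sketch. The helper names are hypothetical; `Integer.toUnsignedLong` is a standard library method available since Java 8 that treats the 32 bits as unsigned, which is one way to get the zero-padded result if an int bit pattern is all you have:

```java
class SignExtensionDemo {
    // Implicit widening: the sign bit of the int is copied into the upper 32 bits.
    static long signExtended(int bits) {
        return bits;
    }

    // Zero extension: the upper 32 bits of the result are always 0.
    static long zeroExtended(int bits) {
        return Integer.toUnsignedLong(bits);
    }

    public static void main(String[] args) {
        int signBitOnly = 0b1000_0000_0000_0000_0000_0000_0000_0000;
        System.out.println(signExtended(signBitOnly)); // -2147483648
        System.out.println(zeroExtended(signBitOnly)); // 2147483648
        System.out.println(signExtended(0b1000_0000)); // 128
    }
}
```

Masking with `bits & 0xFFFF_FFFFL` gives the same zero-extended value as the helper above.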

huangapple
  • Posted on August 5, 2020, 13:59:18
  • When reposting, please keep this link: https://go.coder-hub.com/63259242.html