Implicit left-padding of the binary literal in Java
Question
While constructing a mask to extract the most significant bit of a value in two's complement format, I ran into unexpected behavior.
To check whether the most significant bit of a signed 8-bit number is set, I can extract the bit as follows.
byte value = -1;
long byteSignMask = 0b1000_0000;
long signBit = value & byteSignMask; // 128 here: non-zero, so the sign bit is set
The result is the same whether I use 0b1000_0000 or 1L << 7 for byteSignMask. In fact, the following test passes.
long byteSign1 = 1L << 7;
long byteSign2 = 0b1000_0000;
// OK
assertEquals(byteSign1, byteSign2);
But when I did the same for the int type, the outcome was not what I expected.
long intSign1 = 1L << 31;
long intSign2 = 0b1000_0000_0000_0000_0000_0000_0000_0000;
// Fail: expected:<2147483648> but was:<-2147483648>
assertEquals(intSign1, intSign2);
Actually, they are different.
// intSign1 = 10000000000000000000000000000000
System.out.println("intSign1 = " + Long.toBinaryString(intSign1));
// intSign2 = 1111111111111111111111111111111110000000000000000000000000000000
System.out.println("intSign2 = " + Long.toBinaryString(intSign2));
It looks like the integer mask written as a literal (intSign2) is left-padded with 1s, while the shift operation does not cause such an effect.
Why is the integer expressed by the binary literal automatically left-padded with 1s? Is there any official documentation describing this behavior?
Answer 1
Score: 2
The value you assign to intSign2 here:
0b1000_0000_0000_0000_0000_0000_0000_0000
is an int literal, not a long literal.
So you are saying "I want the int value represented by this bit pattern".
A single 1 followed by 31 0s, interpreted as a 32-bit two's complement signed integer (an int), is -2147483648. This value is then widened to a long when you assign it to the long variable intSign2. That's where the padded 1s come from.
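A minimal sketch to check this claim; the class name IntLiteralCheck and its helper methods are made up for illustration and are not part of the question's code.

```java
class IntLiteralCheck {
    // The 32-digit binary literal has type int, so it denotes Integer.MIN_VALUE.
    static int literal() {
        return 0b1000_0000_0000_0000_0000_0000_0000_0000;
    }

    // Assigning the int to a long widens it; the numeric value is unchanged.
    static long widened() {
        long intSign2 = literal();
        return intSign2;
    }

    public static void main(String[] args) {
        System.out.println(literal() == Integer.MIN_VALUE); // true
        System.out.println(widened());                      // -2147483648
    }
}
```

The literal is evaluated as an int first, and only afterwards converted to long, which is why the target variable's type does not influence how the digits are interpreted.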
To make it a long literal, you would have to add an L suffix:
0b1000_0000_0000_0000_0000_0000_0000_0000L
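As a quick check, the suffixed literal should match the shifted value from the question. This is only a sketch; LongLiteralCheck and withSuffix are hypothetical names.

```java
class LongLiteralCheck {
    // With the L suffix the same digits form a long literal: 2^31, a positive value.
    static long withSuffix() {
        return 0b1000_0000_0000_0000_0000_0000_0000_0000L;
    }

    public static void main(String[] args) {
        System.out.println(withSuffix() == (1L << 31)); // true
        System.out.println(withSuffix());               // 2147483648
    }
}
```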
> Why is byteSign2 padded with left 0s, while intSign2 is padded with left 1s?
When you write a binary integer literal with fewer bits than the bit size of the data type, it is always left-padded with 0s. So in the case of byteSign2, you wrote 0b1000_0000, which is equivalent to this binary literal:
0b0000_0000_0000_0000_0000_0000_1000_0000
In the case of intSign2, you specified all 32 bits of an int, so no padding is done at all.
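The zero-padding of short literals can be verified with a small sketch like the one below, where ZeroPaddingCheck, shortForm, and fullForm are illustrative names only.

```java
class ZeroPaddingCheck {
    // Eight digits: the missing high-order bits of the int are taken as 0.
    static int shortForm() {
        return 0b1000_0000;
    }

    // The same value with all 32 bits of the int written out.
    static int fullForm() {
        return 0b0000_0000_0000_0000_0000_0000_1000_0000;
    }

    public static void main(String[] args) {
        System.out.println(shortForm() == fullForm()); // true
        System.out.println(shortForm());               // 128
    }
}
```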
The left-padded 1s are a result of the int-to-long conversion that took place. According to the language specification, this conversion works like this:
> A widening conversion of a signed integer value to an integral type T simply sign-extends the two's-complement representation of the integer value to fill the wider format.
Because the conversion sign-extends, it pads with 1s if the sign bit is 1 and with 0s if the sign bit is 0. This preserves the sign of the number: negative numbers stay negative, and so on. For your binary literal, the sign bit is 1, so it pads with 1s.
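To see sign extension next to its zero-extending counterpart, here is a rough sketch. SignExtensionDemo, widen, and widenUnsigned are names invented for this example; masking with 0xFFFF_FFFFL is one common way to keep only the low 32 bits, offered here as an assumption about what you may want rather than as the only fix.

```java
class SignExtensionDemo {
    // Implicit int-to-long widening: the sign bit is copied into the upper 32 bits.
    static long widen(int value) {
        return value;
    }

    // Masking after widening clears the upper 32 bits, giving a zero-extended result.
    static long widenUnsigned(int value) {
        return value & 0xFFFF_FFFFL;
    }

    public static void main(String[] args) {
        int signBit = Integer.MIN_VALUE; // bit pattern: 1 followed by 31 zeros

        System.out.println(widen(signBit));         // -2147483648
        System.out.println(widenUnsigned(signBit)); // 2147483648
        System.out.println(widen(1));               // 1

        // The standard library offers the same zero-extension since Java 8.
        System.out.println(Integer.toUnsignedLong(signBit) == widenUnsigned(signBit)); // true
    }
}
```

With the zero-extended form, the result equals 1L << 31, which is the value the test in the question expected.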