Left bit shift and casting
Question
I am seeing behavior that I don't understand. I am trying to construct a 64-bit integer from an array of bytes, converting from big endian to little endian.
uint64_t u;
uint8_t bytes[2];
bytes[1] = 0xFF;
u = bytes[1] << 24;
dump_bytes_as_hex(&u, 8);
00 00 00 FF FF FF FF FF
u = ((uint16_t) bytes[1]) << 24;
dump_bytes_as_hex(&u, 8);
00 00 00 FF FF FF FF FF
u = ((uint32_t) bytes[1]) << 24;
dump_bytes_as_hex(&u, 8);
00 00 00 FF 00 00 00 00
I don't understand why it gives me the correct result only if I cast to a type that has more bits than the shift size. I have tried different values:
- 0xFF-1 gives the same bad result
- 100 gives the correct result without casting
So I wanted to know what the rule is, and why 100 gives me the correct value.
Edit:
Here is a reproducible example:
#include <stdio.h>
#include <stdint.h>

void dump_bytes_as_hex(uint8_t* b, int count)
{
    FILE* f;
    f = stdout;
    for(int c = 0; c < count; ++c)
    {
        fprintf(f, "%02X", b[c]);
        fputc(' ', f);
    }
    fputc('\n', f);
    fflush(f);
}

void test(uint8_t i)
{
    uint64_t u;
    uint8_t bytes[2];
    fprintf(stdout, "Test with %d\n", (int) i);
    u = 0;
    bytes[1] = i;
    u = bytes[1] << 24;
    dump_bytes_as_hex((uint8_t*) &u, 8);
    u = ((uint16_t) bytes[1]) << 24;
    dump_bytes_as_hex((uint8_t*) &u, 8);
    u = ((uint32_t) bytes[1]) << 24;
    dump_bytes_as_hex((uint8_t*) &u, 8);
    fprintf(stdout, "\n\n");
}

int main()
{
    test(0xFF);
    test(0xFF - 1);
    test(100);
    return 0;
}
Answer 1
Score: 0
The type of the variable to which you assign the result, if any, is irrelevant.
It only works if the type of the left-hand operand is a type that, after integer promotion, can hold the result.
- (uint8_t)0xFF << 24
  In an environment with an int of 16 to 32 bits, (uint8_t)0xFF is promoted to int. 0xFF × 2^24 is too large to be represented in that int. Because int is a signed type, this results in undefined behaviour.
  In an environment with an int of 33 or more bits, (uint8_t)0xFF is promoted to int. 0xFF × 2^24 fits in that int. This works.
- (uint16_t)0xFF << 24
  In an environment with an int of 16 to 24 bits, the shift count (24) is greater than or equal to the width of the promoted operand, which is undefined behaviour. (With a 16-bit int, (uint16_t)0xFF is promoted to a 16-bit unsigned int rather than to int.)
  In an environment with an int of 25 to 32 bits, (uint16_t)0xFF is promoted to int. 0xFF × 2^24 is too large to be represented in that int. Because int is a signed type, this results in undefined behaviour.
  In an environment with an int of 33 or more bits, (uint16_t)0xFF is promoted to int. 0xFF × 2^24 fits in that int. This works.
- (uint32_t)0xFF << 24
  0xFF × 2^24 fits in a 32-bit unsigned integer. This works.
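The cases above also explain why 100 appears to work without a cast while 0xFF and 0xFF - 1 do not. With a 32-bit int, 100 × 2^24 = 1,677,721,600 is below INT_MAX (2,147,483,647), whereas 0xFF × 2^24 and 0xFE × 2^24 exceed it. A minimal sketch of that check follows; shift_fits_in_int is a hypothetical helper name used only for illustration, and the code assumes int has no padding bits.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: reports whether `value << shift`, evaluated in
   type int, produces a representable (and therefore defined) result. */
static bool shift_fits_in_int(int value, unsigned shift)
{
    /* Left-shifting a negative value is undefined regardless of the count. */
    assert(value >= 0);

    /* Number of value bits in int, excluding the sign bit
       (assumes int has no padding bits). */
    const unsigned value_bits = (unsigned)(sizeof(int) * CHAR_BIT) - 1;

    /* Conservatively reject any count that would reach the sign bit. */
    if (shift >= value_bits)
        return false;

    /* value * 2^shift <= INT_MAX  is equivalent to  value <= INT_MAX >> shift */
    return value <= (INT_MAX >> shift);
}
```

With a 32-bit int, shift_fits_in_int(100, 24) is true, while shift_fits_in_int(0xFF, 24) and shift_fits_in_int(0xFE, 24) are both false.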
So, to produce a uint64_t, you want
(uint64_t)( (uint64_t)u8 << 24 )
or
(uint64_t)( (uint32_t)u8 << 24 )
However, the outer cast can be dropped here, since the assignment implicitly performs this conversion.
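Applied to the original goal of assembling an integer from big-endian bytes, the same idea can be sketched as below. This is only one possible approach: load_be64 is a hypothetical helper name, and it keeps the accumulator in uint64_t so that every shift is performed in 64-bit unsigned arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: combines up to 8 bytes, most significant first,
   into a uint64_t. */
static uint64_t load_be64(const uint8_t *bytes, size_t count)
{
    assert(bytes != NULL && count <= 8);

    uint64_t value = 0;
    for (size_t i = 0; i < count; ++i) {
        /* `value` has type uint64_t, so the shift never happens in int. */
        value = (value << 8) | bytes[i];
    }
    return value;
}
```

For example, load_be64 on the four bytes FF 00 00 00 yields 0xFF000000, with no sign extension into the upper 32 bits.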
Answer 2
Score: 0
> So i wanted to know what is the rule ?
In fact there is no rule for your specific examples ...
> bytes[1] = 0xFF;
> u = bytes[1] << 24 ;
... and ...
> u = ( (uint16_t) bytes[1]) << 24 ;
> dump_bytes_as_hex( &u, 8 );
... in the event that your int is 32 bits wide.
The left operand of a shift operation is subject to the integer promotions, which have the effect of converting the 8- or 16-bit unsigned value of the left operand to (signed) int and producing a result of that type. If the arithmetic result of a left shift of a value of signed type (i.e. int) is not representable as a value of that type, then the behavior is undefined. That is the case here (which is why I said there is no rule).
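One way to see the promotion is to ask the compiler for the type of the shift expression. The sketch below uses C11 _Generic, which does not evaluate its controlling expression, so the undefined shift is never executed. TYPE_NAME and the two wrapper functions are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* uint16_t (and uint8_t) promote to int only when int can represent
   all of their values; this sketch assumes that is the case. */
static_assert(UINT16_MAX <= INT_MAX, "uint16_t would promote to unsigned int here");

/* Hypothetical macro: names the type of an expression without evaluating it. */
#define TYPE_NAME(expr) _Generic((expr),          \
    int: "int",                                   \
    unsigned int: "unsigned int",                 \
    long: "long",                                 \
    unsigned long: "unsigned long",               \
    long long: "long long",                       \
    unsigned long long: "unsigned long long",     \
    default: "other")

/* Type of the shift when the left operand is a plain uint8_t. */
static const char *type_of_u8_shift(void)
{
    uint8_t b = 0xFF;
    return TYPE_NAME(b << 24);
}

/* Type of the shift when the left operand is first converted to uint32_t. */
static const char *type_of_u32_shift(void)
{
    uint8_t b = 0xFF;
    return TYPE_NAME((uint32_t)b << 24);
}
```

On a platform with a 32-bit int, the first function reports "int" and the second reports an unsigned type, which is why only the second shift is well defined.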
In your particular case, it appears that your implementation is performing the shift as if by reinterpreting the left operand as an unsigned 32-bit value, then reinterpreting the result as a (signed) int. The assignment to type uint64_t then proceeds (normally) by converting the negative right operand to type uint64_t by adding 2^64 to bring it into the range of that type. (Such conversions are performed only for conversion to unsigned integer types, not to signed ones.) You are working on a little-endian system and printing out the resulting bytes in memory order, so you get:
> 00 00 00 FF FF FF FF FF
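The observed output can be reproduced with well-defined operations only. The sketch below is a model of what the implementation appears to do, not something the C standard guarantees; model_observed_shift is a hypothetical name, and the wrap to a negative 32-bit value is written out by hand to avoid relying on implementation-defined conversions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the observed behaviour of `u = byte << 24`
   on an implementation with a 32-bit two's complement int. */
static uint64_t model_observed_shift(uint8_t byte)
{
    /* Step 1: shift as an unsigned 32-bit value (always well defined). */
    uint32_t bits = (uint32_t)byte << 24;

    /* Step 2: reinterpret those bits as a signed 32-bit value by
       subtracting 2^32 when the top bit is set. */
    int64_t wide = (int64_t)bits;
    if (wide > INT32_MAX)
        wide -= (INT64_C(1) << 32);
    assert(wide >= INT32_MIN && wide <= INT32_MAX);
    int32_t as_signed = (int32_t)wide;

    /* Step 3: convert to uint64_t; a negative value has 2^64 added to it,
       which fills the upper 32 bits with ones. */
    return (uint64_t)as_signed;
}
```

Under this model, 0xFF gives 0xFFFFFFFFFF000000 and 0xFE gives 0xFFFFFFFFFE000000, while 100 gives 0x64000000 because the intermediate value is not negative. Stored on a little-endian machine, 0xFFFFFFFFFF000000 is laid out as 00 00 00 FF FF FF FF FF.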
On the other hand, if you convert the shift operand to uint32_t on your 32-bit system, ...
> u = ( (uint32_t) bytes[1]) << 24 ;
> dump_bytes_as_hex( &u, 8 );
... then the resulting type is unaffected by the integer promotions, and the result of the shift has that type. uint32_t can represent the arithmetic result of the shift (0xff000000), so that is in fact the result. That value is unchanged when converted to type uint64_t for the assignment.
General advice: use unsigned types when working with bitwise operations, and especially when performing shifts.