Left bit shift and casting

Question

I'm seeing behavior that I don't understand. I am trying to construct a 64-bit integer from an array of bytes, converting from big endian to little endian.

uint64_t u;
uint8_t bytes[2];

bytes[1] = 0xFF;
u = bytes[1] << 24;
dump_bytes_as_hex(&u, 8);

00 00 00 FF FF FF FF FF

u = ((uint16_t) bytes[1]) << 24;
dump_bytes_as_hex(&u, 8);

00 00 00 FF FF FF FF FF

u = ((uint32_t) bytes[1]) << 24;
dump_bytes_as_hex(&u, 8);

00 00 00 FF 00 00 00 00

I don't understand why it gives me the correct result only if I cast to a type that has more bits than the shift size. I have tried different values:

  • 0xFF-1 gives the same bad result
  • 100 gives the correct result without casting

So I wanted to know what the rule is, and why 100 gives me the correct value.

Edit:

Here is a reproducible example:

#include <stdio.h>
#include <stdint.h>

void dump_bytes_as_hex(uint8_t* b, int count)
{
    FILE* f;

    f = stdout;

    for(int c = 0; c < count; ++c)
    {
        fprintf(f, "%02X", b[c]);
        fputc(' ', f);
    }
    fputc('\n', f);
    fflush(f);
}

void test(uint8_t i)
{
    uint64_t u;
    uint8_t bytes[2];

    fprintf(stdout, "Test with %d\n", (int) i);

    u = 0;
    bytes[1] = i;

    u = bytes[1] << 24;
    dump_bytes_as_hex((uint8_t*) &u, 8);

    u = ((uint16_t) bytes[1]) << 24;
    dump_bytes_as_hex((uint8_t*) &u, 8);

    u = ((uint32_t) bytes[1]) << 24;
    dump_bytes_as_hex((uint8_t*) &u, 8);

    fprintf(stdout, "\n\n");
}

int main()
{
    test(0xFF);
    test(0xFF - 1);
    test(100);

    return 0;
}

Answer 1

Score: 0

The type of the variable to which you assign the result, if any, is irrelevant.

The shift only works if the left-hand operand, after integer promotion, has a type that can hold the result.

  • (uint8_t)0xFF << 24

    In an environment with an int type of 16..32 bits, (uint8_t)0xFF is promoted to an int. 0xFF × 2^24 is too large to hold in an int. Because int is a signed type, this results in undefined behaviour.

    In an environment with an int type of 33+ bits, (uint8_t)0xFF is promoted to an int. 0xFF × 2^24 fits in an int. This works.

  • (uint16_t)0xFF << 24

    In an environment with an int type of 16..24 bits, (uint16_t)0xFF is promoted to an unsigned int (if int is 16 bits) or to an int (if int is 17..24 bits). Either way, the shift count of 24 is greater than or equal to the width of the promoted operand, which is undefined behaviour.

    In an environment with an int type of 25..32 bits, (uint16_t)0xFF is promoted to an int. 0xFF × 2^24 is too large to hold in an int. Because int is a signed type, this results in undefined behaviour.

    In an environment with an int type of 33+ bits, (uint16_t)0xFF is promoted to an int. 0xFF × 2^24 fits in an int. This works.

  • (uint32_t)0xFF << 24

    0xFF × 2^24 fits in a 32-bit unsigned integer. This works.
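The same rule explains why 100 gave the expected output without any cast. With a 32-bit int, 100 × 2^24 = 1677721600 is below INT32_MAX (2147483647), so the shift is well defined, whereas 0xFE × 2^24 and 0xFF × 2^24 are not representable. Here is a minimal sketch of that check, assuming a 32-bit int (modelled with int32_t); the helper name shift_fits_int32 is made up for illustration:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: reports whether value * 2^shift is representable
 * in a 32-bit signed int, i.e. whether `value << shift` would be well
 * defined after promotion to a 32-bit int. The arithmetic is done in
 * uint64_t so the check itself cannot overflow. */
bool shift_fits_int32(uint8_t value, unsigned shift)
{
    if (shift >= 32) {
        return false;   /* shift count >= width of the promoted type */
    }
    uint64_t wide = (uint64_t)value << shift;   /* at most 0xFF << 31 */
    return wide <= (uint64_t)INT32_MAX;
}
```

shift_fits_int32(100, 24) is true, while shift_fits_int32(0xFE, 24) and shift_fits_int32(0xFF, 24) are false. Any byte value of 128 or more overflows a 32-bit int when shifted left by 24.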

So, to produce a uint64_t, you want

(uint64_t)( (uint64_t)u8 << 24 )

or

(uint64_t)( (uint32_t)u8 << 24 )

However, the outer cast can be dropped here since the assignment will implicitly perform this conversion.
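One caveat when building a full 64-bit value: the uint32_t variant only covers shift counts below 32. The sketch below widens to uint64_t before shifting so that every byte position is safe. The helper name byte_at is an assumption for illustration, not something from the question.

```c
#include <stdint.h>

/* Hypothetical helper: places byte b at byte position pos of a 64-bit
 * value (0 = least significant byte, 7 = most significant byte).
 * The cast to uint64_t happens before the shift, so the shift is done
 * in a 64-bit unsigned type and cannot overflow.
 * Precondition: pos <= 7 (a shift count of 64 or more is undefined). */
uint64_t byte_at(uint8_t b, unsigned pos)
{
    return (uint64_t)b << (8u * pos);
}
```

For example, byte_at(0xFF, 3) gives 0xFF000000 and byte_at(0xFF, 7) gives 0xFF00000000000000.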

Answer 2

Score: 0

> So i wanted to know what is the rule ?

In fact there is no rule for your specific examples ...

> bytes[1] = 0xFF;
> u = bytes[1] << 24 ;

... and ...

> u = ( (uint16_t) bytes[1]) << 24 ;
> dump_bytes_as_hex( &u, 8 );

... in the event that your int is 32 bits wide.

The operands of a shift operation are subject to the integer promotions, which here convert the 8- or 16-bit unsigned value of the left operand to (signed) int, and the result has that promoted type. If the arithmetic result of a left shift of a value of signed type (i.e. int) is not representable as a value of that type, then the behavior is undefined. That is the case here, which is why I said there is no rule.

In your particular case, it appears that your implementation performs the shift as if by reinterpreting the left operand as an unsigned 32-bit value, then reinterpreting the result as a (signed) int. The assignment to type uint64_t then proceeds (normally) by converting the negative right operand to type uint64_t by adding 2^64 to bring it into the range of that type. (Such conversions are performed only for conversion to unsigned integer types, not to signed ones.) You are working on a little-endian system and printing out the resulting bytes in memory order, so you get:

> 00 00 00 FF FF FF FF FF

On the other hand, if you convert the shift operand to uint32_t on your 32-bit system, ...

> u = ( (uint32_t) bytes[1]) << 24 ;
> dump_bytes_as_hex( &u, 8 );

... then the integer promotions leave that type unchanged, and the result of the shift has that type. uint32_t can represent the arithmetic result of the shift (0xff000000), so that is in fact the result. That value is unchanged when converted to type uint64_t for the assignment.
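Both conversions to uint64_t are well defined, unlike the signed shift itself, so they can be checked directly. This is a minimal sketch, assuming the overflowing shift left the bit pattern 0xFF000000 in a 32-bit int (the value -16777216). The function names are made up for illustration.

```c
#include <stdint.h>

/* Converting a negative signed value to uint64_t reduces it modulo 2^64,
 * so -16777216 becomes 2^64 - 16777216 = 0xFFFFFFFFFF000000. */
uint64_t widen_signed(int32_t v)
{
    return (uint64_t)v;
}

/* Converting an unsigned 32-bit value to uint64_t keeps the value as is;
 * the upper 32 bits of the result are zero. */
uint64_t widen_unsigned(uint32_t v)
{
    return (uint64_t)v;
}
```

widen_signed(-16777216) yields 0xFFFFFFFFFF000000, which a little-endian machine stores as 00 00 00 FF FF FF FF FF. widen_unsigned(0xFF000000) yields 0x00000000FF000000, which is stored as 00 00 00 FF 00 00 00 00.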


General advice: use unsigned types when working with bitwise operations, and especially when performing shifts.
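Applied to the original goal of assembling a 64-bit integer from big-endian bytes, that advice could look like the following sketch. The function name load_be64 and the fixed length of 8 bytes are assumptions for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: reads 8 bytes stored most significant first and
 * returns them as a native uint64_t. The accumulator is unsigned and
 * 64 bits wide, so no shift is ever performed in a promoted int. */
uint64_t load_be64(const uint8_t bytes[8])
{
    uint64_t value = 0;
    for (int i = 0; i < 8; ++i) {
        value = (value << 8) | bytes[i];
    }
    return value;
}
```

Because the value is built arithmetically rather than by copying memory, the result does not depend on the endianness of the machine running the code.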
