What happens if we compare unsigned short int and unsigned char (comparing the bits)? (in C language)

Question

For example:

    unsigned char mask1 = 0x55; //01010101
    unsigned short int mask2 = 0x8055;//1000000001010101   
    unsigned short int res = 0; //0000000000000000
    res = mask1 | mask2;

So what is res now, in bits?

Does it convert mask1 from 2 bytes to 4? And will the "empty spaces" be filled with zeroes?

I mean, in logical terms, will it work like this?

    res = mask1 | mask2 = 01010101 | 1000000001010101

      0000000001010101
    | 1000000001010101
      ----------------
      1000000001010101

Answer 1

Score: 3

Both operands of the expression

    mask1 | mask2;

are converted to the type int (or to unsigned int if int cannot represent all values of the operand's type) by the integer promotions, which preserve the values stored in the operands.

From the C Standard (6.5.12 Bitwise inclusive OR operator)

> 3 The usual arithmetic conversions are performed on the operands.

and (6.3.1.8 Usual arithmetic conversions)

> 1 Many operators that expect operands of arithmetic type cause
> conversions and yield result types in a similar way. The purpose is to
> determine a common real type for the operands and result....
>
> This pattern is called the usual arithmetic conversions:
> Otherwise (note: if neither operand has a floating type - added by me),
> the integer promotions are performed on both operands

and (6.3.1.1 Boolean, characters, and integers)

> If an int can represent all values of the original type (as restricted
> by the width, for a bit-field), the value is converted to an int;
> otherwise, it is converted to an unsigned int. These are called the
> integer promotions. All other types are unchanged by the integer
> promotions.

So, for example, the value 0x55 stored in an object of type unsigned char is represented after promotion as an int with the value 0x00000055, provided that sizeof(int) is equal to 4. Likewise, the value 0x8055 stored in an object of type unsigned short is represented as 0x00008055.

In this assignment

    res = mask1 | mask2;

the result is converted back to the type unsigned short.

Answer 2

Score: 3

This answer assumes a mainstream system with 8-bit character types, 16-bit short and 32-bit int (which is the case for all mainstream 32- and 64-bit systems in the real world).

First check out https://stackoverflow.com/questions/46073295/implicit-type-promotion-rules. Here is how this works in this particular case:

Each operator in C comes with its own little line stating how implicit promotions are handled. In the case of |, we can peek at C17 6.5.12 "Bitwise inclusive OR operator":

> Constraints
>
> Each of the operands shall have integer type.
>
> Semantics
>
> The usual arithmetic conversions are performed on the operands.

As we learned from the link at the top of this post, the integer promotions are part of the usual arithmetic conversions. So in the expression res = mask1 | mask2;, both operands have small integer types and are therefore promoted to int, which is signed. That is a bit unfortunate, since we want to avoid bitwise arithmetic on signed operands like the plague, though in this specific case it makes no difference. Instead of 0x8055 and 0x55 we get 0x00008055 and 0x00000055 - basically just zero padding.

Thus it is 100% equivalent to res = (int)mask1 | (int)mask2; and the result of mask1 | mask2 is of type int.


Next up, this value is stored in res, which is of type unsigned short. What happens then is "conversion during assignment", 6.5.16.1:

> In simple assignment (=), the value of the right operand is converted to the type of the
> assignment expression and replaces the value stored in the object designated by the left operand.

The specific rules for what this conversion entails are found in C17 6.3.1.3:

> Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.

This conversion works like modulus or, if you will, a binary truncation of the raw value where the most significant bytes are simply discarded.

In this specific case we have an int with value 0x00008055 and it is converted to an unsigned short with value 0x8055.


A curious note regarding "conversion during assignment" is that it also happens on all of these lines:

    unsigned char mask1 = 0x55; //01010101
    unsigned short int mask2 = 0x8055; //1000000001010101
    unsigned short int res = 0;

The numbers here, 0x55 and so forth, are formally called integer constants. Integer constants in C have a type picked according to various intricate rules (C17 6.4.4.1). I won't go through them here, but for now we can note that an integer constant can never be of a smaller type than int. So during all of the above initializations, there is an implicit conversion from int to the type of the left operand.

Answer 3

Score: 0

> what happens .. compare unsigned short int and unsigned char ...?

OP mostly has it.

Usual promotions

Operators like |, ^, &, +, - and others first promote each operand that is narrower than int/unsigned to int/unsigned, with no change in value. If int encompasses the whole range of the narrow type, the operand becomes an int; otherwise it becomes an unsigned.

In OP's case, the unsigned short int likely promotes to int (or possibly to unsigned if int/unsigned is 16-bit). The unsigned char certainly becomes an int.

Conversion to common type

The lower-ranked operand is then converted to the type of the higher-ranked one. This may involve a value change, such as when a negative int is converted to unsigned or when some integers are converted to floating point.

In OP's case, the 2 operands are then int (or possibly unsigned if unsigned is 16-bit). With OP's values, no value change occurs in this step.

Operator | applied

mask1 | mask2 then works almost as OP supposes, but on two ints:

  0b00000000`00000000`00000000`01010101
| 0b00000000`00000000`10000000`01010101
  -------------------------------------
  0b00000000`00000000`10000000`01010101

Or, with 16-bit int/unsigned, on two unsigned operands:

  0b00000000`01010101
| 0b10000000`01010101
  -------------------
  0b10000000`01010101

Assignment narrows the type

The int (or unsigned) result is then converted to unsigned short and then assigned.

When the value is representable in the new, narrower type, that value is saved. Otherwise, if the destination type is a signed integer, the value is converted in an implementation-defined manner; the most common one simply keeps the least significant bits. Otherwise, if the destination type is an unsigned integer, the least significant bits are used (i.e. the value "mod" (max value + 1)).

In OP's case, mask1 | mask2 yields a value in the unsigned short range, so 0b1000000001010101 is saved.

huangapple
  • Posted on June 22, 2023, 16:57:35
  • Source: https://go.coder-hub.com/76530170.html