What happens if we compare unsigned short int and unsigned char (comparing the bits) in C?
Question
For example:
    unsigned char mask1 = 0x55;        // 01010101
    unsigned short int mask2 = 0x8055; // 1000000001010101
    unsigned short int res = 0;        // 0000000000000000
    res = mask1 | mask2;
What is res now, in bits?
Does it convert mask1 from 1 byte to 2, with the "empty spaces" filled with zeroes?
I mean, in logical terms, does it work like this?
    res = mask1 | mask2 = 01010101 | 1000000001010101
      0000000001010101
    | 1000000001010101
      ----------------
      1000000001010101
Answer 1
Score: 3
Both operands of the expression
    mask1 | mask2
are converted to the type int (or to unsigned int if int cannot represent all values of the operand's original type) by the integer promotions, which preserve the values stored in the operands.
From the C Standard (6.5.12 Bitwise inclusive OR operator):
> 3 The usual arithmetic conversions are performed on the operands.
and (6.3.1.8 Usual arithmetic conversions):
> 1 Many operators that expect operands of arithmetic type cause
> conversions and yield result types in a similar way. The purpose is to
> determine a common real type for the operands and result....
>
> This pattern is called the usual arithmetic conversions:
> Otherwise (note: if neither operand has a floating type - added by me),
> the integer promotions are performed on both operands
and (6.3.1.1 Boolean, characters, and integers):
> If an int can represent all values of the original type (as restricted
> by the width, for a bit-field), the value is converted to an int;
> otherwise, it is converted to an unsigned int. These are called the
> integer promotions. All other types are unchanged by the integer
> promotions.
So, for example, the value 0x55 stored in an object of type unsigned char is promoted to an int with the representation 0x00000055, provided that sizeof( int ) is equal to 4. Likewise, the value 0x8055 stored in an object of type unsigned short becomes 0x00008055.
In the assignment
    res = mask1 | mask2;
the result is converted back to the type unsigned short.
Answer 2
Score: 3
This answer assumes a mainstream system with 8-bit character types, 16-bit short and 32-bit int (this is the case for all mainstream 32- and 64-bit systems in the real world).
First check out https://stackoverflow.com/questions/46073295/implicit-type-promotion-rules. Here is how it works in this particular case:
Each operator in C comes with its own little line stating how implicit promotions are handled. In the case of |, we can peek at C17 6.5.12 "Bitwise inclusive OR operator":
> Constraints
>
> Each of the operands shall have integer type.
>
> Semantics
>
> The usual arithmetic conversions are performed on the operands.
As we learned from the link at the top of this post, the integer promotions are part of the usual arithmetic conversions. So in the expression res = mask1 | mask2; both operands have small integer types and are therefore promoted to int, which is signed. That is a bit unfortunate, since we want to avoid bitwise arithmetic on signed operands like the plague, though in this specific case it makes no difference. Instead of 0x8055 and 0x55 we get 0x00008055 and 0x00000055: basically just zero padding.
Thus it is 100% equivalent to res = (int)mask1 | (int)mask2; and the result of mask1 | mask2 is of type int.
Next, this result is stored in res, which is of type unsigned short. What happens then is "conversion during assignment", C17 6.5.16.1:
> In simple assignment (=), the value of the right operand is converted to the type of the assignment expression and replaces the value stored in the object designated by the left operand.
The specific rules for what this conversion entails are found in C17 6.3.1.3:
> Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.
This conversion works like modulus or, if you will, a binary truncation of the raw value where the most significant bytes are simply discarded.
In this specific case we have an int with value 0x00008055, and it is converted to an unsigned short with value 0x8055. Since 0x8055 already fits in a 16-bit unsigned short, the value itself does not change.
A curious note regarding "conversion during assignment" is that it also happens on all of these lines:
    unsigned char mask1 = 0x55;        // 01010101
    unsigned short int mask2 = 0x8055; // 1000000001010101
    unsigned short int res = 0;
The numbers here, 0x55 and so forth, are formally called integer constants. Integer constants in C have a type picked according to various intricate rules (C17 6.4.4.1). I won't go through them here, but for now we can note that an integer constant can never be of a smaller type than int. So during all of the above initializations, we have an implicit conversion from int to the type of the left operand.
Answer 3
Score: 0
> what happens .. compare unsigned short int and unsigned char ...?
OP mostly has it.
Usual promotions
Operators like |, ^, &, +, - and others first promote each operand that is narrower than int/unsigned to int/unsigned with no change in value. If int encompasses the narrow type's range, the operand becomes an int; otherwise it becomes an unsigned.
In OP's case, the unsigned short int likely promotes to int (or possibly to unsigned if unsigned is 16-bit). The unsigned char certainly becomes an int.
Conversion to common type
The lower-ranked operand is then converted to the type of the higher-ranked one. This may involve a value change, such as a negative int being converted to an unsigned, or some integers being converted to floating point.
In OP's case, the 2 operands are then both int (or possibly both unsigned if unsigned is 16-bit). With OP's values, no value change occurs in this step.
Operator | applied
mask1 | mask2 then does almost as OP supposes, operating on 2 ints.
      0b00000000`00000000`00000000`01010101
    | 0b00000000`00000000`10000000`01010101
      -------------------------------------
      0b00000000`00000000`10000000`01010101
Or, with 16-bit int/unsigned, on 2 unsigneds.
      0b00000000`01010101
    | 0b10000000`01010101
      -------------------
      0b10000000`01010101
Assignment narrows the type
The int (or unsigned) result is then converted to unsigned short and assigned.
When the value is representable in the new, narrower type, that value is saved. Otherwise, if the destination type is a signed integer, the value is converted in an implementation-defined manner; the most common one simply keeps the least significant bits. Otherwise, if the destination type is an unsigned integer, the least significant bits are used (i.e. "mod" (max value + 1)).
In OP's case, the result of mask1 | mask2 is within the unsigned short range, so 0b1000000001010101 is saved.