Endianness & Storing Characters into Unsigned Integers
Question
I am initializing a symlink in an ext2 inode (school assignment).
I got the idea to do it in hex, since the field is defined as uint32_t i_block[EXT2_N_BLOCKS].
As an example:
#include <stdio.h>
int main () {
    // unsigned int is 32 bits on my system
    unsigned int i = 0x68656c6c; // hell
    printf("%.*s\n", 4, &i);
}
I got the output
lleh
Is this because my system is little-endian? Does that mean if I hardcode the opposite order, it would not port to big-endian systems (my eventual goal is hello-world)?
What is the best, most simple way to store a character string into an array of unsigned integers?
Answer 1
Score: 1
> Is this because my system is little-endian?
Yes.
> Does that mean if I hardcode the opposite order, it would not port to big-endian systems
Code that relies on the byte order of integers is indeed non-portable.
> What is the best, most simple way to store a character string into an array of unsigned integers?
The best way is not to use integers at all but char, which unlike integers does not depend on endianness and was actually designed for the purpose of storing characters.
You could ignore that it is an integer type and just memcpy a string into it (memcpy is declared in <string.h>):
unsigned int i;
memcpy(&i, "hell", 4);
Or if you prefer: memcpy(&i, "\x68\x65\x6c\x6c", 4);
Otherwise you'll have to invent some ugly hack, for example:
// Run-time check, not a constant expression: use it inside a function.
#define LITTLE_ENDIAN (*(unsigned char*) &(int){0xAA} == 0xAA)
unsigned int i = LITTLE_ENDIAN ? 0x6c6c6568 : 0x68656c6c;
Answer 2
Score: 0
Strictly speaking, printf("%.*s\n", 4, &i); is undefined behavior (UB), as "%.*s" expects a pointer to a character and &i is a pointer to an unsigned int.
A better alternative uses a union.
union {
    unsigned u;
    unsigned char uc[sizeof (unsigned)];
} x = { .u = 0x68656c6c };
printf("%.*s\n", (int) sizeof x.uc, x.uc);
Even better, use uint32_t instead of unsigned.
> What is the best, most simple way to store a character string into an array of unsigned integers?
Avoid all endian concerns via a union and initialize via the .uc member.
#include <stdio.h>
#define N 42
int main(void) {
    union {
        unsigned u[N];
        unsigned char uc[sizeof (unsigned[N])];
    } x = { .uc = "Hello" };
    printf("<%.*s>\n", (int) sizeof x.uc, x.uc);
}
Output
<Hello>
Note that when the initializer is long enough to fill .uc[] completely, the array holds no null character and so is not a string.