Endianness & Storing Characters into Unsigned Integers
Question
I am initializing a symlink in an ext2 inode (school assignment).
I got the idea to do it in hex, since the field is defined as uint32_t i_block[EXT2_N_BLOCKS].
As an example:
#include <stdio.h>
int main () {
    // unsigned int is 32 bits on my system
    unsigned int i = 0x68656c6c; // "hell"
    printf("%.*s\n", 4, &i);
}
I got the output
lleh
Is this because my system is little-endian? Does that mean that if I hardcode the opposite byte order, the code will not port to big-endian systems (my eventual goal is hello-world)?
What is the best, simplest way to store a character string into an array of unsigned integers?
Answer 1
Score: 1
> Is this because my system is little-endian?
Yes.
> Does that mean if I hardcode the opposite order, it would not port to big-endian systems
Code relying on the byte order of integers is non-portable indeed.
> What is the best, most simple way to store a character string into an array of unsigned integers?
The best way is not to use integers at all but char, which, unlike wider integer types, does not depend on endianness and was actually designed for storing characters.
You could ignore that it is an integer type and just memcpy a string into it:
unsigned int i;
memcpy(&i, "hell", 4);
Or if you prefer: memcpy(&i, "\x68\x65\x6c\x6c", 4);.
Otherwise you'll have to invent some ugly hack, for example:
// Named IS_LITTLE_ENDIAN because some system headers already define LITTLE_ENDIAN.
#define IS_LITTLE_ENDIAN (*(unsigned char*) &(int){0xAA} == 0xAA)
unsigned int i = IS_LITTLE_ENDIAN ? 0x6c6c6568 : 0x68656c6c;
Answer 2
Score: 0
Strictly speaking, printf("%.*s\n", 4, &i); is undefined behavior (UB), as "%.*s" expects a pointer to a character type and &i is a pointer to an unsigned int.
A better alternative uses a union.
union {
unsigned u;
unsigned char uc[sizeof (unsigned)];
} x = { .u = 0x68656c6c};
printf("%.*s\n", (int) sizeof x.uc, x.uc);
Even better, use uint32_t instead of unsigned.
> What is the best, most simple way to store a character string into an array of unsigned integers?
Avoid all endian concerns via a union and initialize via the .uc member.
#include <stdio.h>
#define N 42
int main(void) {
union {
unsigned u[N];
unsigned char uc[sizeof (unsigned[N])];
} x = { .uc = "Hello"};
printf("<%.*s>\n", (int) sizeof x.uc, x.uc);
}
Output
<Hello>
Note that if the initializer is long enough to fill the array exactly, .uc[] is not a string, because no null character is stored.