Endianness & Storing Characters into Unsigned Integers

Question

I am initializing a symlink in an ext2 inode (school assignment).

I got the idea to do it in hex, since the field is defined as uint32_t i_block[EXT2_N_BLOCKS].

As an example:

#include <stdio.h>

int main(void) {
  // unsigned int is 32 bits on my system
  unsigned int i = 0x68656c6c; // hell
  printf("%.*s\n", 4, &i);
}

I got the output

lleh

Is this because my system is little-endian? Does that mean if I hardcode the opposite order, it would not port to big-endian systems (my eventual goal is hello-world)?

What is the best, most simple way to store a character string into an array of unsigned integers?

Answer 1

Score: 1

> Is this because my system is little-endian?

Yes.

> Does that mean if I hardcode the opposite order, it would not port to big-endian systems

Yes, code that relies on the byte order of integers is indeed non-portable.

> What is the best, most simple way to store a character string into an array of unsigned integers?

The best way is not to use integers at all but char, which, unlike wider integer types, does not depend on endianness and was actually designed for storing characters.

You could also ignore that it is an integer type and just memcpy a string into it:

unsigned int i;
memcpy(&i, "hell", 4);

Or, if you prefer: memcpy(&i, "\x68\x65\x6c\x6c", 4);

Otherwise, you'll have to invent some ugly hack, for example:

#define LITTLE_ENDIAN  (*(unsigned char*) &(int){0xAA} == 0xAA)
unsigned int i = LITTLE_ENDIAN ? 0x6c6c6568 : 0x68656c6c;

Answer 2

Score: 0

Strictly speaking, printf("%.*s\n", 4, &i); is undefined behavior (UB), because "%.*s" expects a pointer to a character and &i is a pointer to an unsigned int.

A better alternative uses a union.

union {
  unsigned u;
  unsigned char uc[sizeof (unsigned)];
} x = { .u = 0x68656c6c };

printf("%.*s\n", (int) sizeof x.uc, x.uc);

Even better, use uint32_t instead of unsigned.

> What is the best, most simple way to store a character string into an array of unsigned integers?

Avoid all endian concerns via a union and initialize via the .uc member.

#include <stdio.h>
#define N 42

int main(void) {
  union {
    unsigned u[N];
    unsigned char uc[sizeof (unsigned[N])];
  } x = { .uc = "Hello" };
  printf("<%.*s>\n", (int) sizeof x.uc, x.uc);
}

Output

<Hello>

Note that with a long enough initializer, .uc[] might not hold a string at all: if the text fills the whole array, there is no room left for the terminating null character.

huangapple
  • Published on May 22, 2023 at 14:11:46
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76303429.html