How are literals in C stored when defining with both a character array declaration and a character pointer declaration?

Question

I read that with a character pointer declaration, the string literal is first stored in static storage and then a pointer to it is returned. Does the same happen when a literal is used to initialize a char array?

Here, I know "instatic" is stored in static storage because it is a string literal, and test then receives the address of its first character:

```c
char *test = "instatic";
```

But this declaration also uses a string literal. Is the literal first placed in static storage and then copied to the stack?

```c
char test[] = "instatic?";
```

Answer 1

Score: 4

Formally, a string literal is text in source code that is a sequence of characters enclosed in `"` marks, optionally with an encoding prefix of `u8`, `u`, `U`, or `L`, per C 2018 6.4.5 1.

In the abstract model of computing used in the C standard, any string literal, regardless of how it is used, results in the creation of an array of static storage duration that is initialized with the contents of the string, including a terminating null character, per 6.4.5 6.
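One observable consequence of static storage duration is that a pointer to the literal's array stays valid after the function that mentions the literal has returned. The following is a minimal sketch of that; the function names `literal_address` and `literal_length` are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Returns the address of the array created for the literal. That array has
   static storage duration, so the pointer is still valid after this
   function returns. */
const char *literal_address(void)
{
    return "instatic";
}

/* Counts the characters before the terminating null character. */
size_t literal_length(void)
{
    return strlen(literal_address());
}
```

Here `literal_length()` yields 8, and element 8 of the array is the terminating null character, so the array itself has 9 elements.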

That static array might or might not appear in the actual program generated by a compiler, depending on how it is used and how the compiler optimizes the program. In other words, the abstract program described by the C standard always contains a static array for a string literal, but the actual program produced by a C compiler, with optimization, might produce the same behavior as the abstract program without actually using a separate static array for the string literal, per 5.1.2.3 6.

In the case of `char *test = "instatic";`, the compiler generally must create the static array so that `test` can point to it. (This might not occur if `test` is not used or perhaps if only individual characters from the array are used, as the compiler could then perform some optimizations that make the full array unnecessary.)

In the case of `char test[] = "instatic?";` appearing at file scope, a compiler typically produces object code that defines the array `test` with the contents of the string, so a separate static array for the string is not needed. This is a form of optimization: In the abstract model of computing, an array for the string is created, separate memory is allocated for `test`, and the contents of the string are copied into the array. However, in practice, the contents of the string are made to be the initial contents of memory for the array as part of the program loading process.
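The file-scope case can be observed from inside the program: the array is initialized once, so a change made by one call is still there on the next call. This is only a sketch of the visible behavior, not of what any particular compiler emits; the helper name `read_then_mark_global` is hypothetical.

```c
/* File scope: test has static storage duration and receives its initial
   contents once, before execution of the program begins. */
char test[] = "instatic?";

/* Returns the current first character, then overwrites it. */
char read_then_mark_global(void)
{
    char before = test[0];
    test[0] = 'X';
    return before;
}
```

The first call returns `'i'`. Every later call returns `'X'`, because the array is never initialized again.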

In the case of `char test[] = "instatic?";` appearing at block scope (inside a function), the compiler generally needs to create that separate static array for the string, because, each time execution of the block begins, the program has to allocate memory for the `test` array and then initialize it. This cannot be made part of the program loading process because it has to happen multiple times during program execution. Again, optimization may be able to reduce this, depending on circumstances.
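The block-scope case behaves differently when observed from the program: each call gets a newly initialized array, so a change made during one call is gone on the next. A minimal sketch, with the hypothetical helper name `read_then_mark_local`:

```c
/* Returns the first character of a freshly initialized local array, then
   overwrites it. */
char read_then_mark_local(void)
{
    char test[] = "instatic?";  /* a new array on every call */
    char before = test[0];
    test[0] = 'X';              /* changes only this call's array */
    return before;
}
```

Every call returns `'i'`. Where the initial bytes come from (a hidden static array, immediate operands in instructions, or something else) is up to the compiler and cannot be seen from this code.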

Answer 2

Score: 4

> I read that while using character pointer declaration first the string
> literal is getting stored in static storage and then pointer to it is
> returned.

That's a somewhat confusing way to describe it, as if there were some kind of special case behavior. In fact, that behavior falls out naturally from other, more basic aspects of the language specification:

  1. A string literal represents an array of `char` with static storage duration.

  2. Expressions of array type, such as string literals, decay to pointers under most circumstances. Specifically:

    > Except when it is the operand of the sizeof operator, or typeof operators, or the unary & operator,
    > or is a string literal used to initialize an array, an expression that has type "array of type" is converted
    > to an expression with type "pointer to type" that points to the initial element of the array object and
    > is not an lvalue.

    (C23 6.3.2.1/3)

Thus, when you have a string literal as the initializer for anything other than an array, it is automatically converted to a pointer. In your particular case, that pointer is suitable for initializing pointer variable `test`, so all is good.
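The difference between the decayed and non-decayed uses shows up with `sizeof`, one of the exceptions listed in the excerpt. This is a small sketch; the function names are invented for illustration.

```c
#include <stddef.h>

/* Operand of sizeof: no decay, so this is the size of the whole array,
   8 characters plus the terminating null character. */
size_t size_of_literal(void)
{
    return sizeof "instatic";
}

/* Initializer for a pointer: the literal decays, and sizeof then measures
   the pointer object itself. */
size_t size_of_pointer(void)
{
    const char *test = "instatic";
    return sizeof test;
}
```

`size_of_literal()` is 9 on every implementation, while `size_of_pointer()` equals `sizeof(char *)`, which depends on the platform.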

> but is same happening while using literals in char array?

No. As the excerpt above says, a string literal used as the initializer for an array in fact is a special case. The array decay that usually applies does not occur here. Instead, this case is covered with its own initialization rule:

> An array of character type may be initialized by a character string literal or UTF-8 string literal,
> optionally enclosed in braces. Successive bytes of the string literal (including the terminating null
> character if there is room or if the array is of unknown size) initialize the elements of the array.

(C23 6.7.10/15)
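The effect of this rule on array sizes and contents can be checked directly. The sketch below uses made-up function names, and it relies on the general rule that elements without an explicit initializer are zero-initialized.

```c
#include <stddef.h>

/* Unknown size: the array gets one element per character of the literal,
   plus one for the terminating null character. */
size_t unsized_array_size(void)
{
    char test[] = "instatic?";
    return sizeof test;
}

/* Larger explicit size: elements not covered by the literal are
   zero-initialized. Returns 1 if that holds for every such element. */
int tail_is_zero(void)
{
    char test[16] = "instatic?";
    for (size_t i = 9; i < sizeof test; i++) {
        if (test[i] != '\0') {
            return 0;
        }
    }
    return 1;
}
```

With the nine-character literal `"instatic?"`, `unsized_array_size()` is 10. No pointer is involved in either function: the bytes of the literal become the contents of `test` itself.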

> and are they first located in static storage and then they get copied to stack?

If the variable being initialized has static storage duration (on account of being declared at file scope or with the `static` storage-class specifier) then probably not. Semantically, such variables have their initial values already when execution of the program begins. And, such variables generally do not live on the stack.

But if the variable being initialized has automatic storage duration -- that is, it is an ordinary local variable -- then (more or less) each entry into its scope produces a new object, which must be initialized. The initial value needs to be stored somewhere else, and indeed copied in every time. The source from which it is initialized is not directly visible to the program, however, and in practice, it might or might not be analogous to the storage for the string-literal initializer in the pointer case.

huangapple
  • Posted on July 27, 2023 at 21:19:07
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76780167.html