Does C++ support named constants which are guaranteed to not take up memory?

Question

The question is mostly academic, because even a literal is eventually stored in memory, at least in the machine code of the instruction that uses it. Still, is there a way to ensure that an identifier is eliminated at compile time and does not turn into what is essentially a handicapped variable, memory location and all?

Answer 1

Score: 5

Unfortunately, no. C++ doesn't specify the object format, and therefore, it also doesn't specify what exactly goes into the object file and what doesn't. Implementations are free to pack as much extra stuff into the binary as they want, or even omit things that they determine to not be necessary under the as-if rule.

In fact, a very simple thought experiment gives a definitive answer. C++ doesn't require there to be a compiler at all. A conformant C++ interpreter is a perfectly valid implementation of the C++ standard. This interpreter could parse your C++ code into an Abstract Syntax Tree and serialize it to disk. To execute it, it loads the AST and evaluates it, one line of C++ code after the other. Your constexpr variables, #define macros, enum constants, etc. all get loaded into memory by necessity. (This isn't even as hypothetical as you might think: it's exactly what happens during constant evaluation at compile time.)

In other words: The C++ standard has no concept of object format or even compilation. Since it doesn't know what compilation even is, it can't specify any details of that process, so there are no rules on what's kept and what's thrown away during compilation.

The C++ Abstract Machine strikes again.

In practice, there are architectures (like ARM) that don't have instructions to load arbitrary immediates into registers, which means that even a plain old integer literal like 1283572434 will go into a dedicated constant variable section in memory, which you can take the address of. The same can and will happen with constexpr variables, enums, and even #define.

Compilers for x86-64 do this as well for constants that are too large to load via regular mov reg, imm instructions. Very large 256-bit or even 512-bit constants are generally loaded into vector registers by loading them from a constant section somewhere in memory.

Most compilers are of course smart enough to optimize away constants that are only used at compile time. It's not guaranteed by the standard, though, and not even by the compilers themselves.
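To make the distinction concrete, here is a minimal sketch, assuming hypothetical names (`kByValue`, `kAddressed`, and the helper functions are not from the question). A constant that is only read by value gives the compiler every opportunity to fold it away, whereas taking its address odr-uses it, so the implementation generally has to give it a memory location. Neither outcome is guaranteed by the standard.

```cpp
#include <cassert>

// Hypothetical constants for illustration only.
constexpr int kByValue = 42;      // only ever read as a value
constexpr int kAddressed = 1234;  // its address is taken below

// The value can typically be folded into an immediate operand,
// so no separate storage needs to be emitted for kByValue.
int readByValue() { return kByValue + 1; }

// Taking the address odr-uses kAddressed; the returned pointer
// must point at a real object holding 1234.
const int* addressOfConstant() { return &kAddressed; }

// Small usage check.
void demoStorage() {
    assert(readByValue() == 43);
    assert(*addressOfConstant() == 1234);
}
```

Comparing the generated assembly for both functions (for example on Godbolt) is the practical way to see what a particular compiler and optimization level actually does.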

Here's an example where GCC places a #define-d constant into a variable and loads it from memory when needed (Godbolt):

#include <immintrin.h>

#define THAT_VERY_LARGE_VALUE __m256i{1111, 2222, 3333, 4444}

__m256i getThatValue() {
    return THAT_VERY_LARGE_VALUE;
}

Answer 2

Score: 2

  1. The standard way is enum. It has 3 forms:

    • enum {THE_VALUE = 42};

      Usage: std::cout << THE_VALUE;

    • enum MyContainerForConstants {THE_VALUE = 42};

      Usage: as above, and also std::cout << MyContainerForConstants::THE_VALUE;

    • enum: unsigned short {THE_VALUE = 42};

      You can specify a type if you want.

  2. A macro: #define

    #define THE_VALUE 42

    Usage: std::cout << THE_VALUE;

  3. A consteval function (C++20). Use this if your constant requires non-trivial code to calculate.

     consteval int the_other_value()
     {
         int r = 0;
         for (int i = 0; i < 10; ++i)
             r += i;
         return r;
     }
    

    Usage: std::cout << the_other_value();
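
The three forms above can be checked side by side. This is only a sketch under assumptions: the macro is renamed to the hypothetical `THE_MACRO_VALUE` so it does not clash with the enumerator, and `constexpr` stands in for `consteval` so the snippet also builds before C++20 (`consteval` additionally requires every call to be evaluated at compile time). `buffer`, `sumOfConstants`, and `demoConstants` are likewise made-up names.

```cpp
#include <cassert>

enum { THE_VALUE = 42 };      // enumerators are prvalues: &THE_VALUE does not compile

#define THE_MACRO_VALUE 42    // hypothetical name; replaced textually by the preprocessor

// constexpr stand-in for the consteval function from the answer.
constexpr int the_other_value() {
    int r = 0;
    for (int i = 0; i < 10; ++i)
        r += i;
    return r;                 // 0 + 1 + ... + 9
}

// All three are usable where a constant expression is required.
static_assert(THE_VALUE == 42, "enumerator is a constant expression");
static_assert(THE_MACRO_VALUE == 42, "macro expands to a literal");
static_assert(the_other_value() == 45, "evaluated by the compiler");

int buffer[the_other_value()];  // an array bound must be a compile-time constant

// Hypothetical helper that combines the constants at run time.
int sumOfConstants() {
    return THE_VALUE + THE_MACRO_VALUE + the_other_value();
}

void demoConstants() {
    assert(sumOfConstants() == 129);
    assert(sizeof(buffer) / sizeof(buffer[0]) == 45);
}
```

Being usable in a constant expression says nothing about what ends up in the binary; it only shows that the compiler knows each value at compile time.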

If the value happens to be 0, it may not appear in the code at all: for example, a function returning 0 typically compiles to xor eax, eax, and the literal 0 appears nowhere in that instruction. For most other values, the constant typically appears as an immediate operand in the machine code (at least on x86/x64).
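
As a small sketch of that point (the function names are hypothetical), the two functions below differ only in the constant they return. With optimizations on, x86-64 compilers typically emit `xor eax, eax` for the first and `mov eax, 42` for the second, though the exact instructions depend on the compiler and flags.

```cpp
#include <cassert>

// Typically compiled to `xor eax, eax; ret` on x86-64:
// the literal 0 does not appear in the instruction encoding.
int returnZero() { return 0; }

// Typically compiled to `mov eax, 42; ret`:
// 42 is an immediate stored inside the instruction itself.
int returnFortyTwo() { return 42; }

void demoReturns() {
    assert(returnZero() == 0);
    assert(returnFortyTwo() == 42);
}
```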

While it's possible to obfuscate the machine code and hide constant numbers, mainstream compilers don't offer this rather useless feature.

huangapple
  • Posted on February 7, 2023, 04:34:08
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75366298.html