'nm' reports different sizes for variables of the same type. How do I find out their real size?
Question

I was trying to find the addresses and sizes of the variables in my program using `nm`, and I realized that many of my variables are unexpectedly large. I made the following test file, "test.c":

```c
static char test1 = 0;
static char test2 = 0;

char test_f(void)
{
    test1 = test2;
    return test2;
}

int main(void)
{
    return test_f();
}
```

Then I ran the following commands:

    gcc test.c
    nm -C -S --size-sort a.exe | findstr /rc:"test"

The output is:

    0000000140007040 0000000000000001 b test1
    0000000140007041 000000000000000f b test2
    0000000140001540 000000000000001a T test_f

I assume some sort of padding or alignment is at play here, but I don't understand why the padding became part of a symbol. Is there a way to produce a similar text log in which `test1` and `test2` both have a size of 1?
# Answer 1

**Score**: 2

> … I don't understand why the padding became part of a symbol.

The [`nm` man page](https://linux.die.net/man/1/nm) says “The size is computed as the difference between the value of the symbol and the value of the symbol with the next higher value.” Therefore, if there is padding after variable A and before variable B, the padding appears as part of the size of A.

In your example, `test1` was apparently immediately followed by `test2`, so the size of `test1` was computed as one byte. `test2` was not followed by any explicit symbol in its program section; the next “symbol” that `nm` used may have been the beginning of the next section or the first symbol in it. That next section has some alignment requirement, so there is unused space, also called padding, after `test2` and before the next section. The difference between `test2` and the next “symbol” therefore includes that padding, and it shows up in the “size” of `test2`, as the man page states.
# Answer 2

**Score**: 2

> I don't understand why the padding became part of a symbol

According to the [`nm` man page](https://man7.org/linux/man-pages/man1/nm.1.html), the ELF format records sizes for symbols. For other formats (such as EXE), `nm` can only report a computed size: the interval from the start of this symbol to the start of the next one.

> Is there a way to produce a similar text log in which test1 and test2
> would both have the size of 1?

Do the same steps on Linux, which uses ELF binaries:

    0000000000004011 0000000000000001 b test1
    0000000000004012 0000000000000001 b test2
# Answer 3

**Score**: 0

In the end, there seems to be no way to know symbol sizes in a PE file without debug info. With debug info, you can extract it with `objdump -W a.exe`, look up each symbol's type through `DW_AT_type`, and then find the size of that type through `DW_AT_byte_size`:

    <1><28ce>: Abbrev Number: 2 (DW_TAG_variable)
    <28cf> DW_AT_name : test1
    <28d5> DW_AT_decl_file : 1
    <28d6> DW_AT_decl_line : 3
    <28d7> DW_AT_decl_column : 13
    <28d8> DW_AT_type : <0x28e6>
    <28dc> DW_AT_location : 9 byte block: 3 40 70 0 40 1 0 0 0 (DW_OP_addr: 140007040)
    <1><28e6>: Abbrev Number: 3 (DW_TAG_base_type)
    <28e7> DW_AT_byte_size : 1
    <28e8> DW_AT_encoding : 6 (signed char)
    <28e9> DW_AT_name : char
    <1><28ee>: Abbrev Number: 2 (DW_TAG_variable)
    <28ef> DW_AT_name : test2
    <28f5> DW_AT_decl_file : 1
    <28f6> DW_AT_decl_line : 4
    <28f7> DW_AT_decl_column : 13
    <28f8> DW_AT_type : <0x28e6>
    <28fc> DW_AT_location : 9 byte block: 3 41 70 0 40 1 0 0 0 (DW_OP_addr: 140007041)