Unexpected memory usage behaviour when using static linking in C
Question
I am studying static linking and dynamic linking, and my understanding is that static linking consumes more memory and disk space compared to dynamic linking. It is evident that static linking occupies more disk space because the static libraries are compiled into each executable file. But how can I verify that static linking consumes more memory space? For this purpose, I have designed a small experiment as described below.
Step 1: Create a C file named "mylib.c"
// mylib.c
#include <unistd.h>
// This function is designed to create a massive .text section.
void many_code() {
    asm volatile (
        "movabs $0x1122334455667788, %%rax \n\t"
        "movabs $0x1122334455667788, %%rax \n\t"
        "movabs $0x1122334455667788, %%rax \n\t"  /* <----- This line is repeated one million times. */
        ...
        ::: "rax"
    );
    sleep(-1);
}
Step 2: Execute "gcc -c mylib.c -o mylib.o" to compile the source file into an object file.
Step 3: Create a static library by using the command "ar -r libmy.a mylib.o".
Step 4: Create a second C file named "use_st_lib.c".
// use_st_lib.c
extern void many_code();
int main() {
    many_code();
    return 0;
}
Step 5: Create an executable file "use_st_lib_0" by using static linking with the command "gcc use_st_lib.c -static libmy.a -o use_st_lib_0".
Step 6: Create a second executable file by using static linking with the command "gcc use_st_lib.c -static libmy.a -o use_st_lib_1".
➜ static git:(master) ✗ ls -hl
total 91M
-rw-r--r-- 1 root root 9.6M May 29 09:37 libmy.a
-rw-r--r-- 1 root root 51M May 27 23:33 mylib.c
-rw-r--r-- 1 root root 9.6M May 29 09:37 mylib.o
-rwxr-xr-x 1 root root 11M May 29 10:11 use_st_lib_0
-rwxr-xr-x 1 root root 11M May 29 10:12 use_st_lib_1
-rw-r--r-- 1 root root 76 May 27 22:10 use_st_lib.c
Step 7: In another terminal session, use the "top" command and set the refresh time to 1.0s. Apply a filter condition with COMMAND=use_st_lib.
top - 10:21:59 up 15 days, 17:40, 4 users, load average: 0.00, 0.01, 0.05
Tasks: 79 total, 1 running, 78 sleeping, 0 stopped, 0 zombie
%Cpu(s): 1.0 us, 1.0 sy, 0.0 ni, 98.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 1756.5 total, 793.8 free, 133.2 used, 829.5 buff/cache
MiB Swap: 0.0 total, 0.0 free, 0.0 used. 1469.8 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
Step 8: Run the process "use_st_lib_0" in the background by using the command "./use_st_lib_0 &".
➜ static git:(master) ✗ ./use_st_lib_0 &
[1] 31239
top - 10:31:06 up 15 days, 17:49, 4 users, load average: 0.07, 0.06, 0.06
Tasks: 80 total, 1 running, 79 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 1756.5 total, 793.6 free, 133.1 used, 829.8 buff/cache
MiB Swap: 0.0 total, 0.0 free, 0.0 used. 1469.9 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
31239 root 25 5 10.6m 9.5m 9.5m S 0.0 0.5 0:00.00 use_st_lib_0
Step 9: Run the process "use_st_lib_1" in the background by using the command "./use_st_lib_1 &".
➜ static git:(master) ✗ ./use_st_lib_1 &
[2] 31309
top - 10:32:02 up 15 days, 17:50, 4 users, load average: 0.03, 0.05, 0.05
Tasks: 81 total, 1 running, 80 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.0 us, 1.0 sy, 0.0 ni, 99.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 1756.5 total, 793.3 free, 133.4 used, 829.9 buff/cache
MiB Swap: 0.0 total, 0.0 free, 0.0 used. 1469.6 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
31309 root 25 5 10.6m 9.5m 9.5m S 0.0 0.5 0:00.00 use_st_lib_1
31239 root 25 5 10.6m 9.5m 9.5m S 0.0 0.5 0:00.00 use_st_lib_0
Here are my questions:
1. Why does my program show SHR (Shared Memory) usage in the output of the top command? (I expected this column to be 0.)
2. After starting use_st_lib_0, RES (Resident Memory) shows a usage of 9.5m, but the memory usage displayed at the top of the top output does not increase. (I expected it to change from 133.2 used to 123.7 used.)
3. Similarly, after starting use_st_lib_1, the output of the top command shows little or no change. (I expected it to change from 123.7 used to 114.2 used.)
4. In the above steps, where did I make mistakes, or what kind of deviation might have occurred in my understanding?
Above are the attempts I made and the results I expected.
Answer 1
Score: 2
> my understanding is that static linking consumes more memory and disk space compared to dynamic linking.
That understanding is grossly incomplete.
If you have a single ./a.out binary, that binary would generally consume less memory and less disk space when fully statically linked.
This is because:
- Only code and data that is actually referenced gets linked in (in contrast, a shared library must include everything, whether it is used or not).
- No space is needed for the "dynamic linking support tables" (PLT and GOT), nor for dynamic relocations. These can consume a lot of space.
OTOH, if you have multiple binaries which all use the same shared library, and you run all these binaries at the same time, then the total memory and disk space used by them would generally be smaller with dynamic linking, than if each binary was fully statically linked.
That is because you don't need to make a copy of the library as part of each executable, and you don't need to load the code for that library into memory more than once.
Another complication is that Linux uses demand paging, which means that your executable will not consume much RAM unless it actually runs all the code in it, accesses the data, etc.
P.S.
Using top is generally not a good way to account for memory of the process. Its output could be very misleading. You should read man top carefully.
On Linux you can actually account for every page of physical memory by looking in /proc/$pid/pagemap (if your kernel is configured with CONFIG_PROC_PAGE_MONITOR).