Understanding a DT_TEXTREL warning
Question
I have the following code
global main
section .text
main:
mov rax, 1
mov rdi, 1
mov rsi, msg
mov rdx, 6
syscall
mov rax,60
xor rdi,rdi
syscall
section .data
msg:
db "Hello",10,0
I am compiling it to an executable as:
nasm -f elf64 hello.asm
gcc hello.o
The resulting executable correctly prints Hello followed by a newline.
But I get the following warning message while linking.
$ gcc hello.o
/usr/bin/ld: hello.o: warning: relocation in read-only section `.text'
/usr/bin/ld: warning: creating DT_TEXTREL in a PIE
I don't understand what it means. I want to understand what it means and what I have to change to get rid of this message.
Answer 1
Score: 2
Use a RIP-relative LEA to silence the warning: lea rsi, [rel msg]. Or use default rel so lea rsi, [msg] is treated as [rel msg]. See https://stackoverflow.com/questions/57212012/how-to-load-address-of-function-or-label-into-register
mov rsi, msg with a 64-bit absolute address is the worst way; only use it if your executable will be larger than 2GiB (e.g. with huge arrays) so the normal way can't reach.
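As a minimal sketch of that fix, here is the program from the question with only the address load changed. The build commands in the comments are the ones from the question; the readelf check is an extra suggestion, not part of the original post.

```nasm
; hello.asm - build as in the question:
;   nasm -f elf64 hello.asm
;   gcc hello.o
; Optional check (suggestion): readelf -d a.out | grep TEXTREL
; should find nothing once the absolute address is gone from .text.

default rel                     ; from here on, [msg] means [rel msg]

global main

section .text
main:
    mov     eax, 1              ; syscall number for write (writing EAX zero-extends into RAX)
    mov     edi, 1              ; fd 1 = stdout
    lea     rsi, [msg]          ; RIP-relative LEA: no absolute address in .text
    mov     edx, 6              ; "Hello" plus the newline
    syscall
    mov     eax, 60             ; syscall number for exit
    xor     edi, edi            ; exit status 0
    syscall

section .data
msg:
    db "Hello", 10, 0
```

With this change the linker has nothing to relocate inside .text, so it should no longer need to create DT_TEXTREL.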
Modern Linux distros configure GCC to make PIEs (Position Independent Executables) by default so they can benefit from ASLR (load at a random base address). This includes passing -pie to ld when linking a .o file. See https://stackoverflow.com/questions/43367427/32-bit-absolute-addresses-no-longer-allowed-in-x86-64-linux for more.
Using absolute addresses in a PIE requires runtime fixups (done by the dynamic linker ld.so) after the kernel picks an address to load/map your executable. In the .text section (or probably other read-only sections), these are called "text relocations", DT_TEXTREL, and require an mprotect system call to temporarily make the page read+write.
If you make a "static PIE" (gcc -static-pie), _start needs to apply the relocations (and gcc's static-PIE CRT startup code does so), or they won't be done at all if you use your own _start (gcc -static-pie -nostdlib): https://stackoverflow.com/questions/62586808/how-are-relocations-supposed-to-work-in-static-pie-binaries
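If you do write your own _start for a static PIE, the safest approach is to need no runtime relocations at all. The sketch below assumes a file named start.asm (a hypothetical name) and uses only RIP-relative addressing, so nothing has to be fixed up after loading.

```nasm
; start.asm (hypothetical file name)
;   nasm -f elf64 start.asm
;   gcc -static-pie -nostdlib start.o

default rel                     ; [msg] is assembled as [rel msg]

global _start

section .text
_start:
    mov     eax, 1              ; write
    mov     edi, 1              ; stdout
    lea     rsi, [msg]          ; RIP-relative, correct at any load address
    mov     edx, 6              ; "Hello" plus the newline
    syscall
    mov     eax, 60             ; exit: _start has no caller to return to
    xor     edi, edi
    syscall

section .rodata
msg:
    db "Hello", 10
```

Because there is no CRT startup code here, an absolute address such as mov rsi, msg would be left unrelocated and would point at the wrong place once the kernel loads the image at a random base.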
Using 32-bit absolute addresses like movzx edx, byte [msg + rcx] (with the addressing mode being [rcx+disp32]) isn't even possible in 64-bit PIE executables or shared libraries, because they need to be relocatable anywhere in virtual address-space, not just the low 2 GiB: https://stackoverflow.com/questions/43367427/32-bit-absolute-addresses-no-longer-allowed-in-x86-64-linux
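The usual replacement for an indexed access like the one above is to get the base address into a register with a RIP-relative LEA and then index from that register. This is an illustrative sketch, not code from the original answer; it scans msg for its terminating zero byte and prints the string.

```nasm
;   nasm -f elf64 index.asm     (file name is hypothetical)
;   gcc index.o

default rel

global main

section .text
main:
    lea     rsi, [msg]              ; base address, RIP-relative
    xor     ecx, ecx                ; index = 0
.next:
    movzx   edx, byte [rsi + rcx]   ; register base + index, no absolute disp32
    inc     rcx
    test    edx, edx
    jnz     .next                   ; loop until the 0 terminator has been read
    dec     rcx                     ; do not count the terminator
    mov     edx, ecx                ; length for write (before syscall clobbers RCX)
    mov     eax, 1                  ; write
    mov     edi, 1                  ; stdout
    syscall                         ; RSI still holds the address of msg
    xor     eax, eax                ; return 0 from main
    ret

section .rodata
msg:
    db "Hello", 10, 0
```

The cost is one extra LEA outside the loop, which is normally negligible.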
Applying a text relocation dirties the whole page of memory so it's not backed by the executable on disk (like a MAP_PRIVATE file mapping that you modify, it becomes basically an anonymous page that could only be paged out to swap space). And it takes up space for metadata to apply it.
When compilers use 64-bit absolute addresses, e.g. for static pointer variables like a global int *const p = &a; or an array of pointers, they put them in a special section so they're all grouped together (.section .data.rel.ro.local,"aw" for read-only, .data.rel.local for read-write), so hopefully only one page gets dirtied, leaving .rodata and .text clean so they can be shared between processes running the same executable or mapping the same library.
Those compiler-generated absolute addresses are in pages that start out read+write (not .text or .rodata) so you don't get a DT_TEXTREL warning. The same mechanism of applying relocations during startup still happens, but without the mprotect call first. The .data.rel.ro.local section is a subsection of .data so it starts out read+write. I think it gets made read-only after applying relocations, with mprotect(PROT_READ).
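The same layout can be imitated in hand-written NASM. The sketch below is an assumption about how one might do it rather than anything from the original answer: it declares the section with NASM's ELF section attributes (write is required because NASM's default for unknown section names is nowrite), and the names msg_ptr and ptr.asm are made up for illustration.

```nasm
;   nasm -f elf64 ptr.asm       (file name is hypothetical)
;   gcc ptr.o

default rel

global main

section .text
main:
    mov     rsi, [msg_ptr]      ; RIP-relative load of the pointer value
    mov     eax, 1              ; write
    mov     edi, 1              ; stdout
    mov     edx, 6              ; "Hello" plus the newline
    syscall
    xor     eax, eax            ; return 0 from main
    ret

section .rodata
msg:
    db "Hello", 10

; Writable at load time, so the dynamic linker can apply the relocation
; without a text relocation.
section .data.rel.ro.local progbits alloc noexec write align=8
msg_ptr:
    dq msg                      ; 64-bit absolute address stored as data
```

The absolute address still needs a runtime fixup, but it lives in data rather than in the instruction stream, so .text stays clean and shareable.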
GCC does avoid absolute addresses when inventing jump tables for switch by using 32-bit relative offsets: https://stackoverflow.com/questions/52190313/gcc-jump-table-initialization-code-generating-movsxd-and-add
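A hand-written version of that technique might look like the sketch below. It is an illustration under assumptions, not GCC's actual output: it dispatches on argc, stores each target as a 32-bit offset from the table itself, and relies on NASM being able to emit a relative relocation for an expression of the form label - table when table is in the current section.

```nasm
;   nasm -f elf64 jumptab.asm   (file name is hypothetical)
;   gcc jumptab.o

default rel

global main

section .text
main:
    lea     eax, [rdi - 1]              ; index = argc - 1 (zero-extends into RAX)
    cmp     eax, 2
    ja      .default                    ; unsigned compare rejects out-of-range indexes
    lea     rdx, [table]                ; RIP-relative address of the table
    movsxd  rax, dword [rdx + rax*4]    ; sign-extend the 32-bit offset
    add     rax, rdx                    ; table address + offset = target address
    jmp     rax
.case0:
    mov     eax, 10                     ; return 10
    ret
.case1:
    mov     eax, 20                     ; return 20
    ret
.case2:
    mov     eax, 30                     ; return 30
    ret
.default:
    xor     eax, eax                    ; return 0
    ret

section .rodata
align 4
table:
    dd main.case0 - table               ; offsets relative to the table, not absolute addresses
    dd main.case1 - table
    dd main.case2 - table
```

Since the table holds only differences between two addresses in the same image, its contents are the same at every load address and .rodata needs no runtime relocation.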
This warning exists to help compiler developers and people writing asm by hand find places where they've accidentally used absolute addresses in a PIE or shared library (without putting them in a special section).
Or users of compilers who built some files with -fno-pie and then linked them into a PIE or shared library, although that will more usually fail entirely with an error like relocation R_X86_64_32S against `.data' can not be used when making a shared object; recompile with -fPIC. (On most modern distros, GCC is configured with -fPIE as the default, but that didn't use to be the case, or you might have used -fno-pie for some files. DT_TEXTREL also applies to shared libraries.)
You normally don't want 64-bit absolute addresses as part of your machine code in the first place, only as data. mov rsi, msg is not a good way to do things in 64-bit code unless you're making a huge executable that's larger than 2GiB, so the label is more than +-2GiB away from the instruction (so RIP-relative couldn't reach, like -mcmodel=large). In a traditional non-PIE executable you'd want mov esi, msg (32-bit absolute), but the best we can do in position-independent code is lea rsi, [rel msg] (RIP + rel32).
See https://stackoverflow.com/questions/57212012/how-to-load-address-of-function-or-label-into-register
(RIP-relative addressing doesn't exist in 32-bit mode, so avoiding text relocations in 32-bit code is less trivial and has a performance cost. For beginners especially I'd recommend just using -no-pie when linking for 32-bit code. That also works for 64-bit code if you want to keep using inefficient but "simple" stuff like mov rsi, msg and silence the warning.)
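For completeness, here is a sketch of the non-PIE route. It assumes you link with gcc -no-pie, in which case static addresses fit in 32 bits and the shorter mov esi, msg can be used instead of the 64-bit mov rsi, msg. The file name is hypothetical.

```nasm
;   nasm -f elf64 hello_nopie.asm   (file name is hypothetical)
;   gcc -no-pie hello_nopie.o

global main

section .text
main:
    mov     eax, 1              ; write
    mov     edi, 1              ; stdout
    mov     esi, msg            ; 32-bit absolute address, zero-extended into RSI
    mov     edx, 6              ; "Hello" plus the newline
    syscall
    xor     eax, eax            ; return 0 from main
    ret

section .data
msg:
    db "Hello", 10
```

This only links as a non-PIE executable; dropping -no-pie brings back a relocation problem, so the RIP-relative LEA remains the more portable choice.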