March 9, 2023, 18:29:20

x86 code to read in input into a local variable?

Question

I am new to x86 assembly and have gone through this post discussing the use of scanf in NASM x86. It was quite enlightening, however most of the answers used a global variable by doing resd in the .bss section. Instead, I want to scanf into a local variable as if by the following C program:

  #include <stdio.h>
  int main(void) {
      int x;
      scanf("%d", &x);
      return x;
  }

x86 code:

; demo.asm
global main
extern scanf
section .data
fmt: db "%d", 0x00
section .text
main:
    ; prologue
    push ebp
    mov ebp, esp
    ; body
        ; allocate int x, [ebp - 4] = first local variable 
        sub esp, 4
        ; eax <- &x
        lea eax, [ebp - 4]
        ; pass parameters
        push eax ; push &x on stack (4 byte address)
        push fmt ; push format string on stack (4 byte address)
        ; call function
        call scanf
        ; clean (4 + 4 bytes) stack
        add esp, 8
        ; return number read from user to shell
        mov eax, [ebp - 4]
    ; epilogue
    mov esp, ebp ; redundant?
    pop ebp
    ret

I followed the CDECL calling convention and I am using the C 32-bit runtime as well as the C 32-bit standard library for linking.

$ nasm -f elf32 demo.asm
$ gcc -m32 -o demo demo.o
$ ./demo
10
$ echo $?
10

This works, but I am worried about producing fatal errors in larger code. Is that a risk with this implementation?

Answer 1

Score: 7

Looks correct except for stack alignment. I'm guessing you're on Linux, given the nasm -f elf32.

Thus you're using the calling convention documented in the i386 System V ABI. It happens to be essentially the same as MS Windows CDECL, and some people call it that, but it differs from Windows in, for example, how 64-bit structs are returned (in memory on i386 SysV vs. in EDX:EAX in Windows cdecl). If you say "cdecl" to describe your calling convention, with the context being Linux, most people will know that you mean i386 SysV as opposed to gcc -mregparm=3 or something, but IMO "i386 SysV" is nearly as short and much more specific and accurate as a description.

The i386 System V ABI requires 16-byte stack alignment before a call, thus ESP % 16 == 12 on function entry, after call pushes a return address. (This applies in the ABI version used on modern Linux, due to GCC's accidental reliance on -mpreferred-stack-boundary=4, which in 32-bit mode was supposed to be optional / best-effort to help performance. Documenting it as a new requirement was the least-bad way out of the situation, with binaries in the wild that would crash with misaligned ESP. Many other OSes using i386 System V, such as MacOS and FreeBSD, didn't adopt that change, and hand-written asm that only maintains 4-byte stack alignment is still correct there.)

Most 32-bit library functions don't actually depend on that (e.g. they don't use movaps for 16-byte copies for locals in stack space the way many 64-bit functions do; see https://stackoverflow.com/q/51070716). So in 32-bit code it's common that it will happen to work anyway, but it could break in some future Linux distro, exactly the kind of invisible bug you were asking about. Assembly is not a language where trial and error can prove your code is safe and correct.

You have 16 bytes of stack adjustment before the call, rather than 12 + 16*n. (4 each from push ebp and sub esp,4, then 2 more pushes of args for scanf). You could drop the use of EBP as a frame pointer (and adjust the 2 instructions that referenced stack space to use ESP instead), or sub esp,16 instead of 4.
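
The byte counting above can be checked with a few lines of C. This is only a sketch of the arithmetic, not a measurement of the real stack pointer: the helper name esp_mod16_at_call is made up for illustration, and it assumes ESP % 16 == 12 on function entry, as on modern Linux.

```c
/* Model only: assumes ESP % 16 == 12 at function entry (i386 SysV on Linux).
 * bytes_subtracted is the total that pushes and sub esp,N have taken off ESP
 * before the call instruction executes. */
unsigned esp_mod16_at_call(unsigned bytes_subtracted)
{
    const unsigned entry_mod16 = 12u;
    /* Add 16 before subtracting so the unsigned value never wraps below 0. */
    return (entry_mod16 + 16u - bytes_subtracted % 16u) % 16u;
}
```

For the code in the question, 4 (push ebp) + 4 (sub esp,4) + 8 (two arg pushes) is 16 bytes, so esp_mod16_at_call(16) gives 12, which is misaligned. With sub esp,16 the total is 28 bytes and the result is 0; dropping the frame pointer gives 12 bytes, also 0.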

You might be tempted to ask scanf to overwrite one of its args with the conversion result. That would probably be safe in practice, especially for the int *, since scanf needs the pointer in a register to store through it and has no reason to read it again afterwards. But functions own their incoming stack args, can re-read them any number of times, and can assume that no pointers alias them. scanf could in theory be reusing its incoming arg space for its own temporaries or, more plausibly, as args for a tailcall; reusing arg space that way is something real compilers do.

mov esp, ebp is not redundant per se; ESP is still 4 bytes below EBP at that point, as you can see by single-stepping your code with gdb and watching registers change with layout reg. To avoid needing it, you'd have to change other instructions, e.g. mov eax, [ebp - 4] could be pop eax.

It would not be safe to change add esp,8 to add esp,12 unless you move it after the mov eax, [ebp - 4]. Any space below ESP can be trashed asynchronously, e.g. by a signal handler if you had one. Only x86-64 System V has a red zone below the stack pointer that's safe from that.

There is one instruction in your code that's fully redundant: add esp, 8. You're about to reset ESP to EBP, popping all stack allocations including the arg space. (If you'd been making multiple function calls, you could let some args accumulate, or reuse their space with mov stores instead of pushes and pops. But push and pop are good for machine code density.)

The format string doesn't need to be in read-write .data; it can be in .rodata. (Or you could push-immediate and push esp, but it's normal to just put strings in .rodata even when they're tiny.)

If I was writing this for efficiency and minimalism, not doing stuff that isn't required or even useful for this specific function, but not aiming for simplicity and easiest to understand, I might have written this:

extern scanf
section .rodata
fmt: db "%d", 0       ; style: plain 0 seems appropriate to me as a terminator byte
section .text
global main           ; I prefer putting global next to the label, like in C how static is with a function definition, not at the top of the file, but either is valid
main:
  ; ESP % 16 == 12 on entry
    push  -1       ; allocate int x, with a value in case the scanf conversion fails
    push  esp      ; &x.  push esp snapshots the input register before the -= 4 part of push
    push  fmt      ; push format string on stack (4 byte address)
    call scanf     ; 3x 4-byte pushes, ESP % 16 == 0 ready for a call
    ; TODO: check EAX==1 to make sure the conversion succeeded.
    add  esp, 8        ; clean the args from the stack
    pop  eax           ; load the %d conversion result.  (Or the value we pushed earlier if scanf didn't write it)
    ; epilogue
    ; nothing to do here, we didn't save any call-preserved registers
    ret
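
The TODO in the listing above (checking that the call returned 1) together with the push -1 default could look roughly like this in C. This is a sketch, not part of the original answer: it uses sscanf on a string instead of scanf on stdin so it is easy to try out, and the helper name parse_int_or is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: parse one decimal int from text.
 * Returns fallback unless exactly one conversion succeeded, which mirrors
 * "push -1" plus a check that EAX == 1 after the call. */
int parse_int_or(const char *text, int fallback)
{
    int x = fallback;                        /* like push -1: x has a value even on failure */
    int converted = sscanf(text, "%d", &x);  /* count of successful conversions, or EOF */
    return (converted == 1) ? x : fallback;
}
```

Here parse_int_or("10", -1) yields 10, while input such as "abc" or an empty string yields the fallback -1. In the asm version the equivalent is a cmp eax, 1 after the call, before trusting the popped value.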

See https://stackoverflow.com/questions/14968824/what-is-an-assembly-level-representation-of-pushl-popl-esp/69489798#69489798 re: the details of what happens when you push esp: it pushes the old value of ESP, snapshotted before the esp-=4 effect of the push itself.
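
The push esp behaviour described above can be illustrated with a toy model in C. This is a sketch under stated assumptions: it simulates a tiny 64-byte descending stack in an array, and all names (toy_stack, toy_push, toy_push_esp, toy_address_of_x) are invented for the example rather than taken from any real API.

```c
#include <stdint.h>

/* Toy model of a 32-bit descending stack; not real machine state. */
struct toy_stack {
    uint32_t esp;        /* simulated stack pointer (byte address) */
    uint32_t words[16];  /* simulated memory for byte addresses 0..63 */
};

/* push value: ESP -= 4, then store the value at the new ESP. */
void toy_push(struct toy_stack *s, uint32_t value)
{
    s->esp -= 4;
    s->words[s->esp / 4] = value;
}

/* push esp: the operand is read before ESP is decremented. */
void toy_push_esp(struct toy_stack *s)
{
    uint32_t old_esp = s->esp;
    toy_push(s, old_esp);
}

/* Runs "push -1; push esp" and returns the word that push esp stored. */
uint32_t toy_address_of_x(struct toy_stack *s)
{
    toy_push(s, (uint32_t)-1);  /* int x = -1 */
    toy_push_esp(s);            /* &x */
    return s->words[s->esp / 4];
}
```

Starting from a simulated ESP of 64, the -1 lands at address 60 and push esp stores 60, so the pushed pointer refers to the slot holding x, which is what lets scanf write the result there.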

BTW, this is borderline code-review. Code review questions are off-topic on SO these days, but there is https://codereview.stackexchange.com/. This is so short and doing so little that I think it's ok as an SO question.
