May 14, 2023 22:39:59

x86 LEA instruction doing ambiguous things

Question

Here's the C code:

int baz(int a, int b)
{
    return a * 11;
}

It compiles to the following assembly instructions (with the -O2 flag):

baz(int, int):
        lea     eax, [rdi+rdi*4]
        lea     eax, [rdi+rax*2]
        ret

The lea instruction computes the effective address of the second operand (the source operand) and stores it in the first operand. To me, it seems that the first instruction should load an address to the EAX register, but, if so, multiplying RAX by 2 does not make sense in the second lea instruction, so I infer that these two lea instructions do not do quite the same thing.

I was wondering if someone could clarify what exactly is happening here.

Answer 1

Score: 6

The function argument a is passed in rdi, so there is no need to load anything from memory.

lea eax, [rdi+rdi*4] is not calculating the address for any memory location to retrieve data from. Instead, the compiler is just repurposing the instruction to do multiplication. It stores a + a*4 in eax. Let's call that value t.

lea eax, [rdi+rax*2] effectively stores a + t*2 in eax.

eax (the low 32 bits of rax) is also the register in which the function's int return value is returned.

So the return value will be a + t*2, which is a + (a + a*4)*2, which is a + a*5*2 = a + a*10, which is a*11.
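The arithmetic above can be checked with a small C sketch. The helper names lea32, baz_lea, and check_baz_lea are made up for illustration and are not part of the original code. The sketch assumes the usual two's-complement wraparound when converting an out-of-range uint32_t back to int, which is implementation-defined before C23 but holds on common compilers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: models "lea dst32, [base + index*scale]".
   The sum is formed in 64 bits and truncated to 32 bits, as happens
   when the destination is a 32-bit register such as eax. */
uint32_t lea32(uint64_t base, uint64_t index, unsigned scale)
{
    return (uint32_t)(base + index * scale);
}

/* Mirrors the compiler output for baz() step by step. */
int baz_lea(int a)
{
    uint64_t rdi = (uint32_t)a;          /* argument a arrives in rdi       */
    uint64_t rax = lea32(rdi, rdi, 4);   /* lea eax, [rdi+rdi*4]: t = 5*a   */
    rax = lea32(rdi, rax, 2);            /* lea eax, [rdi+rax*2]: a + 2*t   */
    return (int)(uint32_t)rax;           /* the int result is read from eax */
}

/* Self-check: compare against a plain multiplication for small inputs. */
void check_baz_lea(void)
{
    for (int a = -1000; a <= 1000; ++a)
        assert(baz_lea(a) == a * 11);
}
```

Calling check_baz_lea() from a test program should pass silently, since a + 2*(5*a) equals 11*a for every input in that range.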

Answer 2

Score: 4

On Linux, the System V AMD64 ABI calling convention passes the first integer parameter in the register RDI and returns integer values in RAX. EAX is sufficient here because the function returns a 32-bit value. The second parameter (b, passed in RSI) is unused.

LEA was originally intended for address calculations (it dates back to the 8086), but compilers also use it for integer arithmetic with a small constant factor, which is the case here. The constant factor is encoded in the scale field of the SIB byte in the instruction encoding. It can be 1, 2, 4, or 8.
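Because the scale is limited to 1, 2, 4, or 8 and the base register contributes one more copy of the value, an operand of the form [reg + reg*scale] can express a multiplication by 3, 5, or 9 in one instruction. The C functions below (names chosen only for illustration) are written in that shape. Whether a given compiler actually emits a single LEA for them depends on the compiler, target, and flags, so the assembly in the comments is an expectation rather than a guarantee.

```c
/* Each body has the form x + x*scale with scale in {2, 4, 8},
   which matches the base + index*scale addressing form. */
int times3(int x) { return x + x * 2; }   /* candidate for: lea eax, [rdi+rdi*2] */
int times5(int x) { return x + x * 4; }   /* candidate for: lea eax, [rdi+rdi*4] */
int times9(int x) { return x + x * 8; }   /* candidate for: lea eax, [rdi+rdi*8] */
```

A factor such as 11 does not fit this form directly, which is why the compiler output for baz chains two LEA instructions.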

So the code can be read as follows:

baz(RDI, RSI):            ; a, b
lea     eax, [rdi+rdi*4]  ; EAX = 1*a + 4*a   = 5*a
lea     eax, [rdi+rax*2]  ; EAX = 1*a + 2*EAX = 1*a + 2*(5*a)
ret                       ; return EAX = 11*a

Writing to EAX automatically clears the upper half of the 64-bit RAX, so after the first LEA, RAX already holds the zero-extended 32-bit result; see this SO question.
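The effect of the 32-bit destination can be modeled in C as follows. The function name lea_eax is hypothetical, and the sketch only imitates the register behavior described above: the address is computed in 64 bits, the low 32 bits are kept, and the upper half of the result is zero.

```c
#include <stdint.h>

/* Hypothetical model of "lea eax, [base + index*scale]" in 64-bit mode.
   Returns the full 64-bit rax value after the instruction. */
uint64_t lea_eax(uint64_t base, uint64_t index, unsigned scale)
{
    uint64_t address = base + index * scale;  /* 64-bit effective address     */
    return (uint64_t)(uint32_t)address;       /* keep low 32 bits, zero above */
}
```

For example, with junk in the upper half of the input, lea_eax(0xDEADBEEF00000003, 0xDEADBEEF00000003, 4) yields 15: only the low 32 bits of the input influence the low 32 bits of the sum, and the upper half of the result is cleared.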
