How to preserve the register I touch?
Question
The assignment is: "Create an HLA Assembly language program that prompts for three integers from the user. Create and call a function that returns in DX the value of the parameter that is the smallest of the three. In order to receive full credit, after returning back to the caller, your function should not change the value of any register other than DX."
My professor said that my code "is not preserving the registers it touches."
I do not know how to do it.
Could anyone help?
Is it about push and pop?
program Smallest_Number;
#include( "stdlib.hhf" );
static
    value1 : int16;
    value2 : int16;
    value3 : int16;
procedure smallest( var value1 : int16; var value2 : int16; var value3 : int16 ); @nodisplay; @noframe;
static
    dReturnAddress : dword;
begin smallest;
    pop( dReturnAddress );
    pop( AX );
    pop( BX );
    pop( CX );
    push( dReturnAddress );
    mov( AX, DX );
    cmp( DX, BX );
    jl check_1;
    mov( BX, DX );
check_1:
    cmp( DX, CX );
    jl end_smallest;
    jmp update_1;
update_1:
    mov( CX, DX );
    jmp end_smallest;
end_smallest:
    stdout.put( "The smallest value is " );
    stdout.puti16( DX );
    ret( );
end smallest;
begin Smallest_Number;
    stdout.put( "Provide value1:" );
    stdin.get( value1 );
    stdout.put( "Provide value2:" );
    stdin.get( value2 );
    stdout.put( "Provide value3:" );
    stdin.get( value3 );
    mov( value1, AX );
    mov( value2, BX );
    mov( value3, CX );
    push( CX );
    push( BX );
    push( AX );
    call smallest;
end Smallest_Number;
I have no idea how to do it.
Answer 1
Score: 3
> My professor said "My Code Is Not Preserving The Registers It Touches." I do not know how to do it. Could anyone help? Is it about push and pop?
Yes! Given this is an assignment, I won't make changes to your code, but I believe that's what your teacher means by "preserving registers it touches."
It might help to read up on calling conventions. I believe ABIs have mostly standardized them nowadays, but there are still plenty around to choose from.
To help you find your way, consider the following C code:
#include <stdio.h>
int a = 0;
void my_cool_function (int b) {
    a += b + 1;
}
int main () {
    int a;
    a = 0;
    printf("%d", a);
    int b;
    b = 1;
    my_cool_function(b);
    printf("%d", a);
}
As you may already know, the `a` variable inside `main` is completely different from the global variable `a` due to scoping rules.
An interesting question would be: could we get the same behavior if these were registers? That is, if `a` in the program above weren't an abstract container, but rather the name of a register.
Were that the case, the program could behave differently, right? But we can definitely avoid that, somewhat similarly to what is done for variables with the same name. I'll use this uncool sum as a simpler case study:
int my_uncool_sum (int b, int c) {
    return b + c + 1;
}
I've compiled it on my ARM (AArch64) machine with optimizations disabled so the compiler doesn't get in the way. Here is the disassembly dump that I got:
[0x0] <+0>: sub sp, sp, #0x10
[0x4] <+4>: str w0, [sp, #0xc]
[0x8] <+8>: str w1, [sp, #0x8]
[0xc] <+12>: ldr w8, [sp, #0xc]
[0x10] <+16>: ldr w9, [sp, #0x8]
[0x14] <+20>: add w8, w8, w9
[0x18] <+24>: add w0, w8, #0x1
[0x1c] <+28>: add sp, sp, #0x10
[0x20] <+32>: ret
In this architecture, ints are 4 bytes long and the destination register is always the instruction's first operand. Knowing that, there's a symmetry between the first instruction, `sub sp, sp, #0x10`, and the penultimate instruction, `add sp, sp, #0x10`, right? The register used there is the Stack Pointer, and the value `0x10` reserves and then releases 16 bytes. That is more than the 8 bytes two integers need, because the stack pointer has to stay 16-byte aligned on this architecture<sup>1</sup>. Remember: the stack grows from high addresses to low addresses.
Similar to your code, this function also starts by getting hold of its arguments. For you, that was, e.g., `pop( AX );`. That pop instruction takes a value from the stack and puts it into the register `AX`. What was there before? Anyway, let's keep going.
ARM is a load/store architecture: arithmetic instructions work only on registers, and with optimizations disabled the compiler keeps every variable in memory, so a little dance is necessary. First, the argument that arrived in `w0` is stored into the function's own stack slot: `str w0, [sp, #0xc]`. The memory at `sp + 0xc` now holds our first parameter.
Then, the value is loaded back from that memory location: `ldr w8, [sp, #0xc]` (the compiler chose not to reuse `w0` for some reason). `w8` now holds the value of our first parameter. The same process happens with the register pair `w1` and `w9` for the second argument. We add them together, add 1, and increment the Stack Pointer before returning.
One last thing is missing: what does it look like to call this function?
The next snippet is the entire module, with the definition of and a call to the function `my_uncool_sum`:
int my_uncool_sum (int b, int c) {
    return b + c + 1;
}
int uncool_sum_usage() {
    volatile register int busy = 0xFFF;
    return busy + my_uncool_sum(1, 2);
}
Note that I'm being annoying in `uncool_sum_usage` and asking for 4095, `0xFFF` in hex, to be kept around in a register. Here is the disassembly of `uncool_sum_usage`:
[0x24] <+0>: sub sp, sp, #0x20
[0x28] <+4>: stp x29, x30, [sp, #0x10]
[0x2c] <+8>: add x29, sp, #0x10
[0x30] <+12>: mov w8, #0xfff
[0x34] <+16>: stur w8, [x29, #-0x4]
[0x38] <+20>: ldur w8, [x29, #-0x4]
[0x3c] <+24>: str w8, [sp, #0x8]
[0x40] <+28>: mov w0, #0x1
[0x44] <+32>: mov w1, #0x2
[0x48] <+36>: bl 0x48 ; <+36> at main.c:7:17
[0x4c] <+40>: ldr w8, [sp, #0x8]
[0x50] <+44>: add w0, w8, w0
[0x54] <+48>: ldp x29, x30, [sp, #0x10]
[0x58] <+52>: add sp, sp, #0x20
[0x5c] <+56>: ret
As the fourth instruction tells us, `0xfff` was right there in `w8`. It is followed by a `stur` (store register, unscaled offset) instruction. That `stur` is using `x29 - 0x4`. `x29` is another general-purpose register, used here as the frame pointer, and `0x4`, once again, is just enough space for an int<sup>2</sup>. Note that `x29` was recently seen with the Stack Pointer in the third instruction, `add x29, sp, #0x10`<sup>3</sup>. After calling the function, we need `0xFFF` back in order to return, and we get it using the Stack Pointer yet again: `ldr w8, [sp, #0x8]`.
`0xFFF` was saved and restored even though we made a function call that could've altered our registers. Calling conventions are crucial for keeping this behavior consistent. It might help to ask yourself: how is the stack significant for function calls? How was it used in this example, and how does it relate to calling conventions? How can we make sure we retain previous information when jumping to code that might do whatever it wants with registers?
Last note: in case you wish to replicate this on another architecture (maybe one using a different convention?), the code was compiled with `-std=c99 -O0 -g`. `lldb` was used to get the disassembly. If the object file is named `a.out` (the default), you can get the disassembly for the function by running `lldb a.out` and entering `disassemble -n uncool_sum_usage`.
<sup>1</sup> In this architecture, addresses are 64 bits wide. Note that the registers in use are prefixed with `w`, e.g., `w0`. This means they are being used in 32-bit mode, so the top 32 bits are ignored. The two ints occupy only 8 bytes (at `sp + 0x8` and `sp + 0xc`), but the Stack Pointer must stay aligned to 16 bytes, so the compiler reserves 16 bytes, or `0x10` in hex.
<sup>2</sup> This time, we're talking about the size of the value rather than the address. The `stur` instruction here is storing the value of the `busy` variable, namely `0xFFF`.
<sup>3</sup> Not completely related to the problem at hand, but I figured it would be useful to show what's happening here as well. Without giving away too much, the second instruction can be seen as a store of `x29` and `x30`. The following instruction, `add x29, sp, #0x10`, is similar to ones we've seen before: `x29` now holds the address `SP + 0x10` (a higher address than `SP`). The sequence starting with `mov w8, #0xfff` and ending with `ldur w8, [x29, #-0x4]` is probably a consequence of disabling optimization. The end result is that `w8` will be holding the value `0xFFF`. It only matters for the next instruction, `str w8, [sp, #0x8]`, which keeps that value safe until we restore it with `ldr w8, [sp, #0x8]`.