Some confusion about golang assembly

Question

My Golang source code is as follows.

package main

func add(x, y int) int {
	return x + y
}

func main() {
	_ = add(1, 2)
}

The assembly code I obtained using go tool compile -N -l -S main.go > file1.s is as follows (part of it).

;file1.s
"".main STEXT size=54 args=0x0 locals=0x18 funcid=0x0
	0x0000 00000 (main.go:7)	TEXT	"".main(SB), ABIInternal, $24-0
	0x0000 00000 (main.go:7)	CMPQ	SP, 16(R14)
	0x0004 00004 (main.go:7)	PCDATA	$0, $-2
	0x0004 00004 (main.go:7)	JLS	47
    ……
	0x002f 00047 (main.go:7)	CALL	runtime.morestack_noctxt(SB)
	0x0034 00052 (main.go:7)	PCDATA	$0, $-1
	0x0034 00052 (main.go:7)	JMP	0

And the assembly code I obtained using go tool compile -N -l main.go and go tool objdump -S -gnu main.o > file2.s is as follows (part of it).

;file2.s
TEXT "".main(SB) gofile..D:/code/Test/025_go/007_ass/main.go
func main() {
  0x5b6			493b6610		CMPQ 0x10(R14), SP      // cmp 0x10(%r14),%rsp	
  0x5ba			7629			JBE 0x5e5               // jbe 0x5e5
  ……
  func main() {
  0x5e5			e800000000		CALL 0x5ea              // callq 0x5ea	[1:5]R_CALL:runtime.morestack_noctxt	
  0x5ea			ebca			JMP "".main(SB)         // jmp 0x5b6	

My questions are:

  1. Why are the source and destination operands of the CMPQ instructions in file1.s and file2.s swapped, as in CMPQ SP, 16(R14) vs CMPQ 0x10(R14), SP?
  2. For the two listings above, my understanding is: when SP <= R14 + 16, runtime.morestack_noctxt is called to extend the stack. What I don't understand is why the condition is SP <= R14 + 16. What is the logic behind it? Is R14 a link register?
  3. Is the code in file2.s an infinite loop? Why is that? Why is the code in file1.s not an infinite loop?
  4. What is the meaning of [1:5] in [1:5]R_CALL:runtime.morestack_noctxt in file2.s?

I have a basic knowledge of C++/Go as well as assembly, and I understand the memory layout of programs, but I am really confused about the questions above. Can anyone help me, or point me to material I should read?

Thank you to everyone who helps me.

Answer 1

Score: 7

> Why are the source and destination of the CMPQ instructions in file1.s and file2.s opposite, as in CMPQ SP, 16(R14) vs CMPQ 0x10(R14), SP?

This is likely a bug in the disassembler, which I encourage you to report to the Go project. The Go assembler uses AT&T operand order for almost all instructions. The CMP family of instructions is a major exception: for ease of use it keeps the Intel operand order (i.e. CMPQ foo, bar; JGT baz jumps to baz if foo > bar).
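To see which operand the hardware actually treats as the left-hand side, the machine code bytes 493b6610 from file2.s can be decoded by hand. Here is a minimal Go sketch of that decoding. It assumes the standard x86-64 encoding in which opcode 3B /r is CMP r64, r/m64 and sets the flags from (register operand) minus (memory operand); the helper decodeCmp and the register table are made up for illustration and only handle this one instruction form.

```go
package main

import "fmt"

// regNames lists the 64-bit general-purpose registers by encoding number,
// using the names that Go assembly uses.
var regNames = [16]string{
	"AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI",
	"R8", "R9", "R10", "R11", "R12", "R13", "R14", "R15",
}

// decodeCmp (hypothetical helper) decodes only the 4-byte form
// REX.W 3B /r disp8, i.e. CMP r64, [base+disp8] without a SIB byte.
// It returns the register operand, the base register and the displacement.
func decodeCmp(b [4]byte) (reg, base string, disp int8, ok bool) {
	rex, opcode, modrm := b[0], b[1], b[2]
	if rex&0xf8 != 0x48 || opcode != 0x3b || modrm>>6 != 1 || modrm&7 == 4 {
		return "", "", 0, false
	}
	regNum := (modrm>>3)&7 | (rex&0x04)<<1 // ModRM.reg extended by REX.R
	baseNum := modrm&7 | (rex&0x01)<<3     // ModRM.rm extended by REX.B
	return regNames[regNum], regNames[baseNum], int8(b[3]), true
}

func main() {
	reg, base, disp, ok := decodeCmp([4]byte{0x49, 0x3b, 0x66, 0x10})
	if !ok {
		fmt.Println("unsupported encoding")
		return
	}
	// The flags are computed from reg - memory, so in Go's CMP order
	// the register is written first.
	fmt.Printf("CMPQ %s, %d(%s)\n", reg, disp, base) // prints "CMPQ SP, 16(R14)"
}
```

The result matches the compiler's -S listing in file1.s, which suggests that the Go-syntax column of the objdump output is the one printing the operands in the unexpected order.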

> For the above two code, my understanding is: when SP <= R14 + 16, call runtime.morestack_noctxt to extend stack. But what I don't understand is: why is SP <= R14 + 16, what is the logic behind? R14 is link register?

R14 is not a link register. It holds a pointer to the g structure corresponding to the currently active goroutine, and 0x10(R14) holds the stack limit stackguard0. See user1856856's answer for details. This is a new development following the register ABI proposal. stackguard0 is the lowest stack address the goroutine can use before it has to ask the runtime for more stack.

> Is the code in file2.s a dead loop? Why is it so? Why is the code in file1.s not a dead loop?

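No. By the time execution continues after the call to runtime.morestack_noctxt, the runtime has moved the goroutine to a larger stack and updated both SP and the stack limit stored at 16(R14); R14 itself still points to the same g. The function then restarts from its first instruction, and this time the comparison normally succeeds. If it does not, more stack is requested again until it does, so this is normally not an endless loop. The code in file1.s is the same loop: JMP 0 jumps back to offset 0 of the function, just as JMP "".main(SB) does in file2.s.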
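The retry behaviour can be modelled in plain Go. This is only a toy sketch under loud assumptions: the goroutine struct, the guard constant 928 and the "double the stack" policy in moreStack are invented for illustration and are not the runtime's real data structures or values. What it shows is that the loop terminates because each pass changes the values being compared.

```go
package main

import "fmt"

// goroutine is a toy stand-in for the runtime's g (assumption: only the
// fields this sketch needs, not the real layout).
type goroutine struct {
	lo, hi      uintptr // stack bounds [lo, hi); the stack grows downward
	stackguard0 uintptr // lowest SP allowed before asking for more stack
}

// guard is a hypothetical guard size chosen for the example.
const guard = 928

// moreStack models stack growth: the stack size doubles, the bytes already
// in use are kept, and the new stack pointer is returned.
func moreStack(g *goroutine, sp uintptr) uintptr {
	used := g.hi - sp
	size := (g.hi - g.lo) * 2
	g.hi = g.lo + size
	g.stackguard0 = g.lo + guard
	return g.hi - used
}

// enter models the function prologue and reports the final stack pointer
// and how many times the stack had to grow.
func enter(g *goroutine, sp uintptr) (uintptr, int) {
	grows := 0
	for sp <= g.stackguard0 { // CMPQ SP, 16(R14); JLS
		sp = moreStack(g, sp) // CALL runtime.morestack_noctxt(SB)
		grows++               // JMP 0: run the check again
	}
	return sp, grows
}

func main() {
	g := &goroutine{lo: 0x1000, hi: 0x1800, stackguard0: 0x1000 + guard}
	sp, grows := enter(g, 0x1300)
	fmt.Printf("sp=%#x grows=%d\n", sp, grows) // prints "sp=0x1b00 grows=1"
}
```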

> What is the meaning of [1:5] in [1:5]R_CALL:runtime.morestack_noctxt in file2.s?

This comment hints at the presence of a relocation, indicating that the linker will have to patch in the address of runtime.morestack_noctxt at link time. The [1:5] most likely denotes the byte range within the instruction that the relocation covers, i.e. the four displacement bytes following the e8 opcode. You can see that this displacement in the instruction e800000000 is all zeroes, so as it stands the call doesn't go anywhere useful. This only changes when the linker resolves the relocation.
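A short Go sketch shows why the unrelocated call is displayed as CALL 0x5ea: x86-64 relative branches add a signed displacement to the address of the next instruction, so an all-zero displacement points just past the call itself. The helper names rel32Target and rel8Target are made up for this example; the addresses and bytes are taken from file2.s.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// rel32Target returns the target of a 5-byte CALL rel32 (opcode e8) located
// at addr. Bytes 1..4 hold the little-endian signed displacement, which is
// the part a relocation would patch.
func rel32Target(addr uint64, insn []byte) uint64 {
	rel := int32(binary.LittleEndian.Uint32(insn[1:5]))
	return addr + uint64(len(insn)) + uint64(int64(rel))
}

// rel8Target returns the target of a 2-byte short branch (such as eb or 76)
// located at addr; byte 1 is a signed 8-bit displacement.
func rel8Target(addr uint64, insn []byte) uint64 {
	rel := int8(insn[1])
	return addr + uint64(len(insn)) + uint64(int64(rel))
}

func main() {
	fmt.Printf("%#x\n", rel32Target(0x5e5, []byte{0xe8, 0, 0, 0, 0})) // prints 0x5ea
	fmt.Printf("%#x\n", rel8Target(0x5ea, []byte{0xeb, 0xca}))       // prints 0x5b6
	fmt.Printf("%#x\n", rel8Target(0x5ba, []byte{0x76, 0x29}))       // prints 0x5e5
}
```

The second and third lines reproduce the JMP back to the start of main and the JBE to the morestack call, which need no relocation because both targets are inside the same function.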

Answer 2

Score: 1

For question 2: on amd64, R14 is the register that holds the current g. You can check the function stacksplit:

// Append code to p to check for stack split.
// Appends to (does not overwrite) p.
// Assumes g is in rg.
// Returns last new instruction and G register.
func stacksplit(ctxt *obj.Link, cursym *obj.LSym, p *obj.Prog, newprog obj.ProgAlloc, framesize int32, textarg int32) (*obj.Prog, int16) {
	// emit...
	// Load G register
	var rg int16
	p, rg = loadG(ctxt, cursym, p, newprog)

	var q1 *obj.Prog
	if framesize <= objabi.StackSmall {
		// small stack: SP <= stackguard
		//	CMPQ SP, stackguard
		p = obj.Appendp(p, newprog)

		p.As = cmp
		p.From.Type = obj.TYPE_REG
		p.From.Reg = REG_SP
		p.To.Type = obj.TYPE_MEM
		p.To.Reg = rg
		p.To.Offset = 2 * int64(ctxt.Arch.PtrSize) // G.stackguard0
		if cursym.CFunc() {
			p.To.Offset = 3 * int64(ctxt.Arch.PtrSize) // G.stackguard1
		}
		// ... (rest of the function not shown)

and loadG:

func loadG(ctxt *obj.Link, cursym *obj.LSym, p *obj.Prog, newprog obj.ProgAlloc) (*obj.Prog, int16) {
	if ctxt.Arch.Family == sys.AMD64 && cursym.ABI() == obj.ABIInternal {
		// Use the G register directly in ABIInternal
		return p, REGG
	}

	var regg int16 = REG_CX
	if ctxt.Arch.Family == sys.AMD64 {
		regg = REGG // == REG_R14
	}

and, in the file src/cmd/internal/obj/x86/a.out.go:

REGG         = REG_R14     // g register in ABIInternal

The g structure is:

type g struct {
	// Stack parameters.
	// stack describes the actual stack memory: [stack.lo, stack.hi).
	// stackguard0 is the stack pointer compared in the Go stack growth prologue.
	// It is stack.lo+StackGuard normally, but can be StackPreempt to trigger a preemption.
	// stackguard1 is the stack pointer compared in the C stack growth prologue.
	// It is stack.lo+StackGuard on g0 and gsignal stacks.
	// It is ~0 on other goroutine stacks, to trigger a call to morestackc (and crash).
	stack       stack   // offset known to runtime/cgo
	stackguard0 uintptr // offset known to liblink
	stackguard1 uintptr // offset known to liblink
}

and the stack structure is:

// Stack describes a Go execution stack.
// The bounds of the stack are exactly [lo, hi),
// with no implicit data structures on either side.
type stack struct {
	lo uintptr
	hi uintptr
}

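So I think the value at offset 16 from R14 is the current g's stackguard0: the stack field (lo and hi) takes up the first 16 bytes of g, and stackguard0 comes right after it.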

huangapple
  • Posted on May 12, 2022, 22:29:37
  • When reposting, please keep the link to this article: https://go.coder-hub.com/72217460.html