Overhead of an assembly function call in Go
Question
I'm currently experimenting with Go, its assembly, the performance of floating-point operations (float32), and optimizations at the nanosecond scale. I was a bit confused by the overhead of a simple function call:

    func BenchmarkEmpty(b *testing.B) {
        for i := 0; i < b.N; i++ {
        }
    }

    func BenchmarkNop(b *testing.B) {
        for i := 0; i < b.N; i++ {
            doNop()
        }
    }

The implementation of doNop:

    TEXT ·doNop(SB),0,$0-0
        RET

The result (go test -bench .):

    BenchmarkEmpty  2000000000  0.30 ns/op
    BenchmarkNop    2000000000  1.73 ns/op

I'm not used to assembly or the internals of Go. Is it possible for the Go compiler or linker to inline a function defined in assembly? Can I give the linker a hint somehow? For simple functions like "add two R3 vectors", the call overhead eats up all of the possible performance gain.

(go 1.4.2, amd64)
Answer 1
Score: 1
Assembly functions are not inlined. Here are three things you could try:

1. Move your loop into assembly. For example, given this function:

       func Sum(xs []int64) int64

   you can write:

       #include "textflag.h"

       // Arguments plus result occupy 32 bytes:
       // a 24-byte slice header followed by the 8-byte return value.
       TEXT ·Sum(SB),NOSPLIT,$0-32
           MOVQ xs_base+0(FP),DI   // DI = pointer to the first element
           MOVQ xs_len+8(FP),SI    // SI = len(xs)
           MOVQ $0,CX              // CX = running sum
           MOVQ $0,AX              // AX = i
       L1:
           CMPQ AX,SI              // i < len(xs)
           JGE  Z1
           LEAQ (DI)(AX*8),BX      // BX = &xs[i]
           MOVQ (BX),BX            // BX = *BX
           ADDQ BX,CX              // CX += BX
           INCQ AX                 // i++
           JMP  L1
       Z1:
           MOVQ CX,ret+24(FP)
           RET

   The standard library contains examples of this approach.

2. Write some of your code in C, take advantage of its support for intrinsics or inline assembly, and use cgo to call it from Go.

3. Use gccgo to do the same thing as #2, except that you can declare the C function directly:

       //extern open
       func c_open(name *byte, mode int, perm int) int

   Reference: https://golang.org/doc/install/gccgo#Function_names
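As a rough illustration of the first suggestion, here is a pure-Go sketch of the same batching idea: the loop lives inside the callee, so the call overhead is paid once per slice instead of once per element. The Vec3 type and the names addVec3s and sumGo are made up for this example and are not from the original post. sumGo computes what the assembly Sum above is meant to compute, so it could serve as a reference when testing an assembly version.

```go
package main

import "fmt"

// Vec3 is a hypothetical three-component float32 vector (an assumption for
// this sketch, not a type from the original post).
type Vec3 [3]float32

// addVec3s writes a[i]+b[i] into dst[i] for every index that exists in all
// three slices and returns how many elements it processed. One call handles
// the whole batch, so the per-call cost is amortized over n additions.
func addVec3s(dst, a, b []Vec3) int {
	n := len(dst)
	if len(a) < n {
		n = len(a)
	}
	if len(b) < n {
		n = len(b)
	}
	for i := 0; i < n; i++ {
		dst[i][0] = a[i][0] + b[i][0]
		dst[i][1] = a[i][1] + b[i][1]
		dst[i][2] = a[i][2] + b[i][2]
	}
	return n
}

// sumGo is a plain Go equivalent of the Sum function declared above.
func sumGo(xs []int64) int64 {
	var total int64
	for _, x := range xs {
		total += x
	}
	return total
}

func main() {
	a := []Vec3{{1, 2, 3}, {10, 20, 30}}
	b := []Vec3{{4, 5, 6}, {1, 1, 1}}
	dst := make([]Vec3, len(a))

	n := addVec3s(dst, a, b)
	fmt.Println(n, dst)
	fmt.Println(sumGo([]int64{1, 2, 3, 4}))
}
```

Whether a batched assembly routine actually beats this kind of plain Go loop depends on the workload, so it is worth benchmarking both. The same batching idea applies to the cgo route as well, since a cgo call costs considerably more than a plain Go call.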