Golang:数组的索引效率

huangapple go评论120阅读模式
英文:

golang: index efficiency of array

问题

这是一个简单的程序。

测试环境:Debian 8,Go 1.4.2

union.go:

  1. package main
  2. import "fmt"
  3. type A struct {
  4. t int32
  5. u int64
  6. }
  7. func test() (total int64) {
  8. a := [...]A{{1, 100}, {2, 3}}
  9. for i := 0; i < 5000000000; i++ {
  10. p := &a[i%2]
  11. total += p.u
  12. }
  13. return
  14. }
  15. func main() {
  16. total := test()
  17. fmt.Println(total)
  18. }

union.c:

  1. #include <stdio.h>
  2. struct A {
  3. int t;
  4. long u;
  5. };
  6. long test()
  7. {
  8. struct A a[2];
  9. a[0].t = 1;
  10. a[0].u = 100;
  11. a[1].t = 2;
  12. a[1].u = 3;
  13. long total = 0;
  14. long i;
  15. for (i = 0; i < 5000000000; i++) {
  16. struct A* p = &a[i % 2];
  17. total += p->u;
  18. }
  19. return total;
  20. }
  21. int main()
  22. {
  23. long total = test();
  24. printf("%ld\n", total);
  25. }

结果比较:

Go:

  1. 257500000000
  2. real 0m9.167s
  3. user 0m9.196s
  4. sys 0m0.012s

C:

  1. 257500000000
  2. real 0m3.585s
  3. user 0m3.560s
  4. sys 0m0.008s

看起来Go编译生成了很多奇怪的汇编代码(你可以使用objdump -D来查看)。

例如,为什么movabs $0x12a05f200,%rbp出现了两次?

  1. 400c60: 31 c0 xor %eax,%eax
  2. 400c62: 48 bd 00 f2 05 2a 01 movabs $0x12a05f200,%rbp
  3. 400c69: 00 00 00
  4. 400c6c: 48 39 e8 cmp %rbp,%rax
  5. 400c6f: 7d 46 jge 400cb7 <main.test+0xb7>
  6. 400c71: 48 89 c1 mov %rax,%rcx
  7. 400c74: 48 c1 f9 3f sar $0x3f,%rcx
  8. 400c78: 48 89 c3 mov %rax,%rbx
  9. 400c7b: 48 29 cb sub %rcx,%rbx
  10. 400c7e: 48 83 e3 01 and $0x1,%rbx
  11. 400c82: 48 01 cb add %rcx,%rbx
  12. 400c85: 48 8d 2c 24 lea (%rsp),%rbp
  13. 400c89: 48 83 fb 02 cmp $0x2,%rbx
  14. 400c8d: 73 2d jae 400cbc <main.test+0xbc>
  15. 400c8f: 48 6b db 10 imul $0x10,%rbx,%rbx
  16. 400c93: 48 01 dd add %rbx,%rbp
  17. 400c96: 48 8b 5d 08 mov 0x8(%rbp),%rbx
  18. 400c9a: 48 01 f3 add %rsi,%rbx
  19. 400c9d: 48 89 de mov %rbx,%rsi
  20. 400ca0: 48 89 5c 24 28 mov %rbx,0x28(%rsp)
  21. 400ca5: 48 ff c0 inc %rax
  22. 400ca8: 48 bd 00 f2 05 2a 01 movabs $0x12a05f200,%rbp
  23. 400caf: 00 00 00
  24. 400cb2: 48 39 e8 cmp %rbp,%rax
  25. 400cb5: 7c ba jl 400c71 <main.test+0x71>
  26. 400cb7: 48 83 c4 20 add $0x20,%rsp
  27. 400cbb: c3 retq
  28. 400cbc: e8 6f e0 00 00 callq 40ed30 <runtime.panicindex>
  29. 400cc1: 0f 0b ud2
  30. ...

而C的汇编代码更加简洁:

  1. 0000000000400570 <test>:
  2. 400570: 48 c7 44 24 e0 64 00 movq $0x64,-0x20(%rsp)
  3. 400577: 00 00
  4. 400579: 48 c7 44 24 f0 03 00 movq $0x3,-0x10(%rsp)
  5. 400580: 00 00
  6. 400582: b9 64 00 00 00 mov $0x64,%ecx
  7. 400587: 31 d2 xor %edx,%edx
  8. 400589: 31 c0 xor %eax,%eax
  9. 40058b: 48 be 00 f2 05 2a 01 movabs $0x12a05f200,%rsi
  10. 400592: 00 00 00
  11. 400595: eb 18 jmp 4005af <test+0x3f>
  12. 400597: 66 0f 1f 84 00 00 00 nopw 0x0(%rax,%rax,1)
  13. 40059e: 00 00
  14. 4005a0: 48 89 d1 mov %rdx,%rcx
  15. 4005a3: 83 e1 01 and $0x1,%ecx
  16. 4005a6: 48 c1 e1 04 shl $0x4,%rcx
  17. 4005aa: 48 8b 4c 0c e0 mov -0x20(%rsp,%rcx,1),%rcx
  18. 4005af: 48 83 c2 01 add $0x1,%rdx
  19. 4005b3: 48 01 c8 add %rcx,%rax
  20. 4005b6: 48 39 f2 cmp %rsi,%rdx
  21. 4005b9: 75 e5 jne 4005a0 <test+0x30>
  22. 4005bb: f3 c3 repz retq
  23. 4005bd: 0f 1f 00 nopl (%rax)

有人能解释一下吗?谢谢!

英文:

It's a simple program.
test environment: debian 8, go 1.4.2

union.go:

  1. package main
  2. import &quot;fmt&quot;
  3. type A struct {
  4. t int32
  5. u int64
  6. }
  7. func test() (total int64) {
  8. a := [...]A{{1, 100}, {2, 3}}
  9. for i := 0; i &lt; 5000000000; i++ {
  10. p := &amp;a[i%2]
  11. total += p.u
  12. }
  13. return
  14. }
  15. func main() {
  16. total := test()
  17. fmt.Println(total)
  18. }

union.c:

  1. #include &lt;stdio.h&gt;
  2. struct A {
  3. int t;
  4. long u;
  5. };
  6. long test()
  7. {
  8. struct A a[2];
  9. a[0].t = 1;
  10. a[0].u = 100;
  11. a[1].t = 2;
  12. a[1].u = 3;
  13. long total = 0;
  14. long i;
  15. for (i = 0; i &lt; 5000000000; i++) {
  16. struct A* p = &amp;a[i % 2];
  17. total += p-&gt;u;
  18. }
  19. return total;
  20. }
  21. int main()
  22. {
  23. long total = test();
  24. printf(&quot;%ld\n&quot;, total);
  25. }

result compare:

go:

  1. 257500000000
  2. real 0m9.167s
  3. user 0m9.196s
  4. sys 0m0.012s

C:

  1. 257500000000
  2. real 0m3.585s
  3. user 0m3.560s
  4. sys 0m0.008s

It seems that the go compiles lot of weird assembly codes (you could use objdump -D to check it).

For example, why movabs $0x12a05f200,%rbp appears twice?

  1. 400c60: 31 c0 xor %eax,%eax
  2. 400c62: 48 bd 00 f2 05 2a 01 movabs $0x12a05f200,%rbp
  3. 400c69: 00 00 00
  4. 400c6c: 48 39 e8 cmp %rbp,%rax
  5. 400c6f: 7d 46 jge 400cb7 &lt;main.test+0xb7&gt;
  6. 400c71: 48 89 c1 mov %rax,%rcx
  7. 400c74: 48 c1 f9 3f sar $0x3f,%rcx
  8. 400c78: 48 89 c3 mov %rax,%rbx
  9. 400c7b: 48 29 cb sub %rcx,%rbx
  10. 400c7e: 48 83 e3 01 and $0x1,%rbx
  11. 400c82: 48 01 cb add %rcx,%rbx
  12. 400c85: 48 8d 2c 24 lea (%rsp),%rbp
  13. 400c89: 48 83 fb 02 cmp $0x2,%rbx
  14. 400c8d: 73 2d jae 400cbc &lt;main.test+0xbc&gt;
  15. 400c8f: 48 6b db 10 imul $0x10,%rbx,%rbx
  16. 400c93: 48 01 dd add %rbx,%rbp
  17. 400c96: 48 8b 5d 08 mov 0x8(%rbp),%rbx
  18. 400c9a: 48 01 f3 add %rsi,%rbx
  19. 400c9d: 48 89 de mov %rbx,%rsi
  20. 400ca0: 48 89 5c 24 28 mov %rbx,0x28(%rsp)
  21. 400ca5: 48 ff c0 inc %rax
  22. 400ca8: 48 bd 00 f2 05 2a 01 movabs $0x12a05f200,%rbp
  23. 400caf: 00 00 00
  24. 400cb2: 48 39 e8 cmp %rbp,%rax
  25. 400cb5: 7c ba jl 400c71 &lt;main.test+0x71&gt;
  26. 400cb7: 48 83 c4 20 add $0x20,%rsp
  27. 400cbb: c3 retq
  28. 400cbc: e8 6f e0 00 00 callq 40ed30 &lt;runtime.panicindex&gt;
  29. 400cc1: 0f 0b ud2
  30. ...

while the C assembly is more clean:

  1. 0000000000400570 &lt;test&gt;:
  2. 400570: 48 c7 44 24 e0 64 00 movq $0x64,-0x20(%rsp)
  3. 400577: 00 00
  4. 400579: 48 c7 44 24 f0 03 00 movq $0x3,-0x10(%rsp)
  5. 400580: 00 00
  6. 400582: b9 64 00 00 00 mov $0x64,%ecx
  7. 400587: 31 d2 xor %edx,%edx
  8. 400589: 31 c0 xor %eax,%eax
  9. 40058b: 48 be 00 f2 05 2a 01 movabs $0x12a05f200,%rsi
  10. 400592: 00 00 00
  11. 400595: eb 18 jmp 4005af &lt;test+0x3f&gt;
  12. 400597: 66 0f 1f 84 00 00 00 nopw 0x0(%rax,%rax,1)
  13. 40059e: 00 00
  14. 4005a0: 48 89 d1 mov %rdx,%rcx
  15. 4005a3: 83 e1 01 and $0x1,%ecx
  16. 4005a6: 48 c1 e1 04 shl $0x4,%rcx
  17. 4005aa: 48 8b 4c 0c e0 mov -0x20(%rsp,%rcx,1),%rcx
  18. 4005af: 48 83 c2 01 add $0x1,%rdx
  19. 4005b3: 48 01 c8 add %rcx,%rax
  20. 4005b6: 48 39 f2 cmp %rsi,%rdx
  21. 4005b9: 75 e5 jne 4005a0 &lt;test+0x30&gt;
  22. 4005bb: f3 c3 repz retq
  23. 4005bd: 0f 1f 00 nopl (%rax)

Could somebody explain it? Thanks!

答案1

得分: 3

主要区别在于数组边界检查。在Go程序的反汇编转储中,可以看到以下内容:

  1. 400c89: 48 83 fb 02 cmp $0x2,%rbx
  2. 400c8d: 73 2d jae 400cbc <main.test+0xbc>
  3. ...
  4. 400cbc: e8 6f e0 00 00 callq 40ed30 <runtime.panicindex>
  5. 400cc1: 0f 0b ud2

因此,如果 %rbx 大于或等于2,则跳转到调用 runtime.panicindex 的位置。考虑到你正在使用大小为2的数组,这显然是边界检查。你可以认为编译器应该足够聪明,在可以静态确定索引范围的特定情况下跳过边界检查,但似乎它还不够聪明。

虽然在这个微基准测试中你看到了明显的性能差异,但值得考虑的是,这是否真正代表了你实际代码的情况。如果在循环中还有其他操作,差异可能不太明显。

虽然边界检查确实会带来一些开销,但在许多情况下,这比程序继续执行未定义行为的替代方案要好。

英文:

The main difference is the the array bounds checking. In the disassembly dump for the Go program, there is:

  1. 400c89: 48 83 fb 02 cmp $0x2,%rbx
  2. 400c8d: 73 2d jae 400cbc &lt;main.test+0xbc&gt;
  3. ...
  4. 400cbc: e8 6f e0 00 00 callq 40ed30 &lt;runtime.panicindex&gt;
  5. 400cc1: 0f 0b ud2

So if %rbx is greater than or equal to 2, then it jumps down to a call to runtime.panicindex. Given you're working with an array of size 2, that is clearly the bounds check. You could make the argument that the compiler should be smart enough to skip the bounds check in this particular case where the range of the index can be determined statically, but it seems that it isn't smart enough to do so yet.

While you're seeing a noticeable performance difference for this micro-benchmark, it might be worth considering whether this is actually representative of your actual code. If you're doing other stuff in your loop, the difference is likely to be less noticeable.

And while bounds checking does have a cost, in many cases it is better than the alternative of the program continuing with undefined behaviour.

huangapple
  • 本文由 发表于 2015年7月22日 14:40:56
  • 转载请务必保留本文链接:https://go.coder-hub.com/31555672.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定