Assembly implementation of a string comparison

Question

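While reading the assembly output of a simple program built with the Go compiler, I could not make sense of the string comparison implementation.

The program is: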

package main

import (
	"fmt"
	"os"
)

func main() {

	fmt.Print("Enter the value: ")

	var v string
	fmt.Fscanf(os.Stdin, "%v", &v)

	if v != "123456" {
		fmt.Println("exit")
		os.Exit(2)
	}
	fmt.Println("v=", v)
}

In the extract below, what does the code at 0x48bd94 and 0x48bd9d do?

$ objdump --disassemble=main.main ./z
...
  48bd85: 48 8b 54 24 38        mov    0x38(%rsp),%rdx
  48bd8a: 4c 8b 02              mov    (%rdx),%r8
  48bd8d: 48 83 7a 08 06        cmpq   $0x6,0x8(%rdx)
  48bd92: 75 12                 jne    48bda6 <main.main+0xe6>
  48bd94: 41 81 38 31 32 33 34  cmpl   $0x34333231,(%r8)
  48bd9b: 75 09                 jne    48bda6 <main.main+0xe6>
  48bd9d: 66 41 81 78 04 35 36  cmpw   $0x3635,0x4(%r8)
  48bda4: 74 49                 je     48bdef <main.main+0x12f>
...

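At 0x48bd94, a cmpl instruction compares the immediate $0x34333231 with the 4-byte value at (%r8).

At 0x48bd9d, a cmpw instruction compares the immediate $0x3635 with the 2-byte value at 0x4(%r8).

The code checks whether the input string equals "123456"; if it does not, the program prints "exit" and exits.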

Answer 1

Score: 1

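Thanks to the comments, it does make sense.

To summarize, those two instructions are part of a larger three-step implementation of the string comparison.

At 0x48bd8d, it compares the string lengths: $0x6 is the hardcoded length of 123456, and 0x8(%rdx) is the length of the scanned value.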

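At 0x48bd94 and 0x48bd9d, it compares the string contents in two steps. x86-64 is little-endian, so the 32-bit immediate 0x34333231 matches the bytes 31 32 33 34 in memory, i.e. "1234", and the 16-bit immediate 0x3635 matches the bytes 35 36, i.e. "56". Read as hex digits from the most significant byte, they look like "4321" and "65". They are immediates, not some random memory address as I was thinking at first.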

  48bd8d:   48 83 7a 08 06          cmpq   $0x6,0x8(%rdx)
  48bd92:   75 12                   jne    48bda6 <main.main+0xe6>
  48bd94:   41 81 38 31 32 33 34    cmpl   $0x34333231,(%r8)
  48bd9b:   75 09                   jne    48bda6 <main.main+0xe6>
  48bd9d:   66 41 81 78 04 35 36    cmpw   $0x3635,0x4(%r8)
  48bda4:   74 49                   je     48bdef <main.main+0x12f>
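
The three checks in this listing can be mimicked in Go as a rough sketch. `equals123456` is a hypothetical helper written for illustration, not what the compiler emits, and it assumes the little-endian byte order of x86-64.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// equals123456 is a hypothetical helper that mirrors the three steps of the
// disassembly: one length check, then a 4-byte and a 2-byte comparison.
func equals123456(s string) bool {
	// Step 1: cmpq $0x6,0x8(%rdx) compares the length with 6.
	if len(s) != 6 {
		return false
	}
	b := []byte(s)
	// Step 2: cmpl $0x34333231,(%r8) compares the first 4 bytes with "1234".
	if binary.LittleEndian.Uint32(b[0:4]) != 0x34333231 {
		return false
	}
	// Step 3: cmpw $0x3635,0x4(%r8) compares the last 2 bytes with "56".
	return binary.LittleEndian.Uint16(b[4:6]) == 0x3635
}

func main() {
	for _, s := range []string{"123456", "123457", "654321", "12345"} {
		fmt.Printf("%q -> %v\n", s, equals123456(s))
	}
}
```

Checking the length first means the two loads never read past the end of a shorter input.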

And yes, Jester got it right.

As kostix mentioned, this is an optimization; a comparison against a longer string will produce something like

  48bda0:   e8 5b 63 f7 ff          call   402100 <runtime.memequal>

  • Thanks to https://stackoverflow.com/questions/19748074/meaning-of-0x8rsp and https://stackoverflow.com/questions/54195834/how-to-inspect-slice-header, this makes sense.

For my future self: https://cs.brown.edu/courses/cs033/docs/guides/x64_cheatsheet.pdf

huangapple
  • Posted on May 25, 2022 at 20:39:14
  • When reposting, please keep the link to this article: https://go.coder-hub.com/72377809.html