Printing strings and characters as hexadecimal in Go

Question

Why does a Cyrillic string printed in hexadecimal format differ from the same Cyrillic character printed in hexadecimal format?

s := "Э" // a string
fmt.Printf("%x\n", s)
// result: d0ad

r := 'Э' // a rune
fmt.Printf("%x\n", r)
// result: 42d

Answer 1

Score: 1

Printing a string with %x prints the hexadecimal representation of its bytes, while printing a rune with %x prints the hexadecimal representation of its numeric value, the Unicode code point (rune is an alias for int32).
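A minimal sketch of that difference, assuming only the standard fmt package. The helper names hexOfString and hexOfRune are made up for illustration; they just wrap fmt.Sprintf.

```go
package main

import "fmt"

// hexOfString formats the bytes of s as lowercase hex (hypothetical helper).
func hexOfString(s string) string {
	return fmt.Sprintf("%x", s)
}

// hexOfRune formats the integer value of r as lowercase hex (hypothetical helper).
func hexOfRune(r rune) string {
	return fmt.Sprintf("%x", r)
}

func main() {
	s := "Э" // string: a sequence of UTF-8 bytes
	r := 'Э' // rune: an int32 holding the code point

	fmt.Println(hexOfString(s)) // d0ad
	fmt.Println(hexOfRune(r))   // 42d

	// Converting between the two forms swaps the outputs.
	fmt.Println(hexOfString(string(r))) // d0ad
	for _, c := range s {               // range decodes the UTF-8 bytes into runes
		fmt.Println(hexOfRune(c)) // 42d
	}

	fmt.Printf("% x\n", s) // d0 ad (bytes separated by spaces)
	fmt.Printf("%U\n", r)  // U+042D (Unicode notation)
}
```

So if the goal is to see the code point of each character in a string, ranging over the string and printing each rune is one way to get it.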

Strings in Go hold the UTF-8 encoded byte sequence of the text. In UTF-8, characters (runes) with a code point greater than 127 are encoded as more than one byte.
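The byte count and the rune count of a string can be compared directly. The sketch below assumes the standard unicode/utf8 package; byteAndRuneCount is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAndRuneCount returns the length of s in bytes and in runes
// (hypothetical helper for this example).
func byteAndRuneCount(s string) (bytes, runes int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	fmt.Println(byteAndRuneCount("z")) // 1 1
	fmt.Println(byteAndRuneCount("Э")) // 2 1

	// utf8.RuneLen reports how many bytes a rune needs in UTF-8.
	fmt.Println(utf8.RuneLen('z')) // 1 (code point 122, below 128)
	fmt.Println(utf8.RuneLen('Э')) // 2 (code point 1069)
}
```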

The rune Э is encoded in UTF-8 as two bytes, [208, 173] (0xd0 0xad), and these are not the same as the bytes of the 32-bit integer 1069 = 0x42d. Integers are represented using two's complement in memory.

Recommended blog post: Strings, bytes, runes and characters in Go

Posted by huangapple on May 13, 2022, 15:20:12
Link: https://go.coder-hub.com/72225772.html