Why are string and []byte treated differently when unmarshaling JSON?
Question
My understanding from reading the documentation was that string is essentially an immutable []byte and that one can easily convert between the two.
However, when unmarshaling from JSON this doesn't seem to be true. Take the following example program:
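package main

import (
	"encoding/json"
	"fmt"
)

// STHRaw receives the "hash" member as a byte slice.
type STHRaw struct {
	Hash []byte `json:"hash"`
}

// STHString receives the "hash" member as a string.
type STHString struct {
	Hash string `json:"hash"`
}

func main() {
	data := []byte(`{"hash": "nuyHN9wx4lZL2L3Ir3dhZpmggTQEIHEZcC3DUNCtQsk="}`)

	// new already returns a pointer, so it is passed to Unmarshal directly.
	stringHead := new(STHString)
	if err := json.Unmarshal(data, stringHead); err != nil {
		fmt.Println("unmarshal into STHString:", err)
		return
	}

	rawHead := new(STHRaw)
	if err := json.Unmarshal(data, rawHead); err != nil {
		fmt.Println("unmarshal into STHRaw:", err)
		return
	}

	// %x prints the bytes of a string or a []byte as hexadecimal.
	fmt.Printf("String:\t\t%x\n", stringHead.Hash)
	fmt.Printf("Raw:\t\t%x\n", rawHead.Hash)
	fmt.Printf("Raw to string:\t%x\n", string(rawHead.Hash))
}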
This gives the following output:
String: 6e7579484e397778346c5a4c324c3349723364685a706d67675451454948455a63433344554e437451736b3d
Raw: 9eec8737dc31e2564bd8bdc8af77616699a0813404207119702dc350d0ad42c9
Raw to string: 9eec8737dc31e2564bd8bdc8af77616699a0813404207119702dc350d0ad42c9
Instead I would have expected to receive the same value each time.
What is the difference?
Answer 1
Score: 4
The designers of the encoding/json package decided that string values must hold valid UTF-8 text, while []byte values may hold arbitrary byte sequences. To keep the resulting JSON string valid UTF-8, the package base64-encodes []byte values when marshaling and base64-decodes them when unmarshaling. That is why the string field in the question holds the base64 text itself, while the []byte field holds the 32 decoded bytes.
The encoding of []byte values is described in the Marshal function documentation.
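As a minimal sketch of that behavior, the program below marshals a []byte field, unmarshals it again, and also base64-decodes the hash from the question by hand. The names payload, roundTrip, and decodeHashHex are made up for this illustration; only the encoding/json, encoding/base64, and encoding/hex calls come from the standard library.

```go
package main

import (
	"encoding/base64"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// payload mirrors the STHRaw type from the question (hypothetical name).
type payload struct {
	Hash []byte `json:"hash"`
}

// roundTrip marshals raw inside a payload, unmarshals the result again,
// and returns the JSON text plus the recovered bytes as hex.
func roundTrip(raw []byte) (string, string, error) {
	out, err := json.Marshal(payload{Hash: raw})
	if err != nil {
		return "", "", err
	}
	var back payload
	if err := json.Unmarshal(out, &back); err != nil {
		return "", "", err
	}
	return string(out), hex.EncodeToString(back.Hash), nil
}

// decodeHashHex base64-decodes s by hand and returns the bytes as hex.
func decodeHashHex(s string) (string, error) {
	b, err := base64.StdEncoding.DecodeString(s)
	if err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

func main() {
	jsonText, hexBack, err := roundTrip([]byte{0xde, 0xad, 0xbe, 0xef})
	if err != nil {
		fmt.Println("round trip failed:", err)
		return
	}
	fmt.Println(jsonText) // {"hash":"3q2+7w=="}
	fmt.Println(hexBack)  // deadbeef

	// Decoding the question's hash by hand gives the "Raw" output.
	h, err := decodeHashHex("nuyHN9wx4lZL2L3Ir3dhZpmggTQEIHEZcC3DUNCtQsk=")
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(h)
}
```

The last line printed should match the "Raw" line in the question, which suggests the []byte field was filled by base64-decoding the JSON string.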
This decision was not dictated by the design of the Go language. The string type can contain arbitrary byte sequences. The []byte type can contain valid UTF-8 text.
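A short sketch of the language-level point, assuming nothing beyond the standard library: a string can carry bytes that are not valid UTF-8, a []byte can carry ordinary text, and only encoding/json treats the two differently. The helper roundTripString is a made-up name for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"unicode/utf8"
)

// roundTripString marshals s as a JSON string and unmarshals it again,
// returning whatever text survives the trip.
func roundTripString(s string) (string, error) {
	out, err := json.Marshal(s)
	if err != nil {
		return "", err
	}
	var back string
	if err := json.Unmarshal(out, &back); err != nil {
		return "", err
	}
	return back, nil
}

func main() {
	// A string may hold bytes that are not valid UTF-8.
	s := string([]byte{0xff, 0xfe})
	fmt.Println(len(s), utf8.ValidString(s)) // 2 false

	// A []byte may hold valid UTF-8 text.
	b := []byte("héllo")
	fmt.Println(utf8.Valid(b)) // true

	// Converting between the two copies the bytes unchanged.
	fmt.Println([]byte(s)[0] == 0xff, string(b) == "héllo") // true true

	// encoding/json does not preserve the invalid bytes of a string:
	// each one comes back as the replacement rune U+FFFD.
	back, err := roundTripString(s)
	if err != nil {
		fmt.Println("round trip failed:", err)
		return
	}
	fmt.Println(back == "\uFFFD\uFFFD") // true
}
```

So the conversion itself is lossless. The loss happens in the JSON layer, where a string with invalid bytes is coerced to valid UTF-8.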
The designers could have used a flag in the field tag to indicate that a string or []byte value should be encoded, and which encoder to use, but that's not what they did.
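Since there is no such tag, one way to choose a different encoding is a named type that implements the json.Marshaler and json.Unmarshaler interfaces. The sketch below is an illustration rather than anything the answer prescribes: HexBytes, sthHex, encodeHex, and decodeHex are hypothetical names, and hex is picked only as an example of an alternative encoder.

```go
package main

import (
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// HexBytes is a hypothetical type for this sketch: a byte slice that
// travels through JSON as a hex string instead of base64.
type HexBytes []byte

// MarshalJSON encodes the bytes as a JSON string of hex digits.
func (h HexBytes) MarshalJSON() ([]byte, error) {
	return json.Marshal(hex.EncodeToString(h))
}

// UnmarshalJSON expects a JSON string of hex digits and decodes it.
func (h *HexBytes) UnmarshalJSON(data []byte) error {
	var s string
	if err := json.Unmarshal(data, &s); err != nil {
		return err
	}
	b, err := hex.DecodeString(s)
	if err != nil {
		return err
	}
	*h = b
	return nil
}

// sthHex is a made-up counterpart of STHRaw that uses HexBytes.
type sthHex struct {
	Hash HexBytes `json:"hash"`
}

// encodeHex marshals raw inside an sthHex value.
func encodeHex(raw []byte) (string, error) {
	out, err := json.Marshal(sthHex{Hash: raw})
	return string(out), err
}

// decodeHex unmarshals doc into an sthHex value and returns its bytes.
func decodeHex(doc string) ([]byte, error) {
	var v sthHex
	if err := json.Unmarshal([]byte(doc), &v); err != nil {
		return nil, err
	}
	return v.Hash, nil
}

func main() {
	out, err := encodeHex([]byte{0xde, 0xad, 0xbe, 0xef})
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(out) // {"hash":"deadbeef"}

	b, err := decodeHex(`{"hash":"9eec8737"}`)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("%x\n", b) // 9eec8737
}
```

The cost of this approach is that the encoding is tied to the type rather than to an individual field, so two fields that need different encodings need two types.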