Got error "invalid character 'ï' looking for beginning of value" from json.Unmarshal
Question
I use a Golang HTTP request to get JSON output, as follows.
The web service I am trying to access is Microsoft Translator: https://msdn.microsoft.com/en-us/library/dn876735.aspx
// Data struct of TransformTextResponse
type TransformTextResponse struct {
	ErrorCondition   int    `json:"ec"`       // A positive number representing an error condition
	ErrorDescriptive string `json:"em"`       // A descriptive error message
	Sentence         string `json:"sentence"` // transformed text
}
//some code ....
body, err := ioutil.ReadAll(response.Body)
defer response.Body.Close()
if err != nil {
	return "", tracerr.Wrap(err)
}
transTransform := TransformTextResponse{}
err = json.Unmarshal(body, &transTransform)
if err != nil {
	return "", tracerr.Wrap(err)
}
I got this error: invalid character 'ï' looking for beginning of value
So I printed the body as a string with fmt.Println(string(body)), which shows:
{"ec":0,"em":"OK","sentence":"This is too strange i just want to go home soon"}
The data doesn't seem to have any problem, so I tried to create the same value with json.Marshal:
transTransform := TransformTextResponse{}
transTransform.ErrorCondition = 0
transTransform.ErrorDescriptive = "OK"
transTransform.Sentence = "This is too strange i just want to go home soon"
jbody, _ := json.Marshal(transTransform)
I suspected the original data might have a problem, so I compared the two as []byte.
Data from response.Body:
[239 187 191 123 34 101 99 34 58 48 44 34 101 109 34 58 34 79 75 34 44 34 115 101 110 116 101 110 99 101 34 58 34 84 104 105 115 32 105 115 32 116 111 111 32 115 116 114 97 110 103 101 32 105 32 106 117 115 116 32 119 97 110 116 32 116 111 32 103 111 32 104 111 109 101 32 115 111 111 110 34 125]
Data from json.Marshal:
[123 34 101 99 34 58 48 44 34 101 109 34 58 34 79 75 34 44 34 115 101 110 116 101 110 99 101 34 58 34 84 104 105 115 32 105 115 32 116 111 111 32 115 116 114 97 110 103 101 32 105 32 106 117 115 116 32 119 97 110 116 32 116 111 32 103 111 32 104 111 109 101 32 115 111 111 110 34 125]
Any idea how I can parse this response.Body and unmarshal it into the data structure?
Answer 1
Score: 37
The server is sending you a UTF-8 text string with a Byte Order Mark (BOM). The BOM identifies that the text is UTF-8 encoded, but it should be removed before decoding.
This can be done with the following line (using package "bytes"):
body = bytes.TrimPrefix(body, []byte("\xef\xbb\xbf")) // Or []byte{239, 187, 191}
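To show that line in context, here is a minimal sketch that strips the BOM and then unmarshals into the struct from the question. The helper name decodeTransformText and the variable utf8BOM are made up for illustration, and the input is a hard-coded byte slice standing in for the HTTP response body.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// TransformTextResponse mirrors the struct from the question.
type TransformTextResponse struct {
	ErrorCondition   int    `json:"ec"`
	ErrorDescriptive string `json:"em"`
	Sentence         string `json:"sentence"`
}

// utf8BOM is the three-byte UTF-8 byte order mark (239 187 191).
var utf8BOM = []byte{0xEF, 0xBB, 0xBF}

// decodeTransformText is a hypothetical helper: it removes a leading
// UTF-8 BOM if one is present, then unmarshals the remaining JSON.
func decodeTransformText(body []byte) (TransformTextResponse, error) {
	var out TransformTextResponse
	// TrimPrefix returns body unchanged when it does not start with the BOM.
	body = bytes.TrimPrefix(body, utf8BOM)
	if err := json.Unmarshal(body, &out); err != nil {
		return out, fmt.Errorf("decode response: %w", err)
	}
	return out, nil
}

func main() {
	// Stand-in for the bytes read from response.Body, BOM included.
	raw := []byte("\xef\xbb\xbf" + `{"ec":0,"em":"OK","sentence":"This is too strange i just want to go home soon"}`)

	resp, err := decodeTransformText(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(resp.Sentence)
}
```

Because TrimPrefix is a no-op when the prefix is absent, the same helper should also work for responses that arrive without a BOM.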
PS. The error mentions ï because the UTF-8 BOM, interpreted as an ISO-8859-1 string, produces the characters ï»¿.
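The ISO-8859-1 reading can be checked with a small sketch. It relies on the fact that ISO-8859-1 maps each byte to the Unicode code point of the same value; latin1ToString is a made-up helper name, not a standard library function.

```go
package main

import "fmt"

// latin1ToString (hypothetical helper) decodes bytes as ISO-8859-1,
// where every byte maps to the Unicode code point with the same value.
func latin1ToString(b []byte) string {
	runes := make([]rune, len(b))
	for i, c := range b {
		runes[i] = rune(c)
	}
	return string(runes)
}

func main() {
	bom := []byte{239, 187, 191} // the first three bytes of the response body
	fmt.Println(latin1ToString(bom))
}
```

The first byte, 239, becomes ï, which is the character quoted in the json.Unmarshal error.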