Got error "invalid character 'ï' looking for beginning of value" from json.Unmarshal

Question

I use a Golang HTTP request to fetch JSON output, as shown below.
The web service I am trying to access is Microsoft Translator: https://msdn.microsoft.com/en-us/library/dn876735.aspx

// Data struct of TransformTextResponse
type TransformTextResponse struct {
	ErrorCondition   int    `json:"ec"`       // A positive number representing an error condition
	ErrorDescriptive string `json:"em"`       // A descriptive error message
	Sentence         string `json:"sentence"` // Transformed text
}


// some code ....
defer response.Body.Close() // schedule the close before reading the body
body, err := ioutil.ReadAll(response.Body)
if err != nil {
	return "", tracerr.Wrap(err)
}

// Decode the raw bytes into the response struct.
transTransform := TransformTextResponse{}
err = json.Unmarshal(body, &transTransform)
if err != nil {
	return "", tracerr.Wrap(err)
}

I got this error: invalid character 'ï' looking for beginning of value

So I tried printing the body as a string with fmt.Println(string(body)), and it shows:

{"ec":0,"em":"OK","sentence":"This is too strange i just want to go home soon"}

The data does not seem to have any problem, so I tried to create the same value with json.Marshal:

transTransform := TransformTextResponse{}
transTransform.ErrorCondition = 0
transTransform.ErrorDescriptive = "OK"
transTransform.Sentence = "This is too strange i just want to go home soon"
jbody, _ := json.Marshal(transTransform)

I suspected the original data might have a problem, so I compared the two values in []byte format.

Data from response.Body:

[239 187 191 123 34 101 99 34 58 48 44 34 101 109 34 58 34 79 75 34 44 34 115 101 110 116 101 110 99 101 34 58 34 84 104 105 115 32 105 115 32 116 111 111 32 115 116 114 97 110 103 101 32 105 32 106 117 115 116 32 119 97 110 116 32 116 111 32 103 111 32 104 111 109 101 32 115 111 111 110 34 125]

Data from json.Marshal:

[123 34 101 99 34 58 48 44 34 101 109 34 58 34 79 75 34 44 34 115 101 110 116 101 110 99 101 34 58 34 84 104 105 115 32 105 115 32 116 111 111 32 115 116 114 97 110 103 101 32 105 32 106 117 115 116 32 119 97 110 116 32 116 111 32 103 111 32 104 111 109 101 32 115 111 111 110 34 125]
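The only difference between the two byte slices above is the leading 239 187 191 (hex EF BB BF). A minimal sketch of how that prefix could be checked for directly, assuming a shortened sample body; the helper name hasUTF8BOM is hypothetical and not part of any library:

```go
package main

import (
	"bytes"
	"fmt"
)

// utf8BOM holds the three leading bytes seen in the response body
// (decimal 239 187 191).
var utf8BOM = []byte{0xEF, 0xBB, 0xBF}

// hasUTF8BOM reports whether data starts with those three bytes.
// The name is hypothetical; it only wraps bytes.HasPrefix.
func hasUTF8BOM(data []byte) bool {
	return bytes.HasPrefix(data, utf8BOM)
}

func main() {
	// Assumed sample: the three extra bytes followed by a short JSON object.
	body := []byte("\xef\xbb\xbf" + `{"ec":0,"em":"OK"}`)

	// %q prints the bytes as a quoted Go string and escapes characters
	// that are not printable, so the prefix should no longer be invisible
	// the way it is with fmt.Println(string(body)).
	fmt.Printf("%q\n", body)

	fmt.Println(hasUTF8BOM(body))
}
```

Printing with %q instead of converting to a plain string is a diagnostic choice only; it does not change the data.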

Any idea how I can parse this response.Body and unmarshal it into the data structure?

Answer 1

Score: 37

The server is sending you a UTF-8 text string with a Byte Order Mark (BOM). The BOM identifies that the text is UTF-8 encoded, but it should be removed before decoding.

This can be done with the following line (using package "bytes"):

body = bytes.TrimPrefix(body, []byte("\xef\xbb\xbf")) // Or []byte{239, 187, 191}

PS. The error refers to ï because the UTF-8 BOM (bytes EF BB BF), interpreted as an ISO-8859-1 string, produces the characters ï»¿.
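Putting the pieces together, here is a minimal sketch of the fix applied to the struct from the question. The helper name decodeTransformResponse and the hard-coded sample body are assumptions for illustration; in real code the bytes would come from response.Body.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// TransformTextResponse mirrors the struct from the question.
type TransformTextResponse struct {
	ErrorCondition   int    `json:"ec"`       // A positive number representing an error condition
	ErrorDescriptive string `json:"em"`       // A descriptive error message
	Sentence         string `json:"sentence"` // Transformed text
}

// decodeTransformResponse is a hypothetical helper: it strips a leading
// UTF-8 BOM if one is present, then unmarshals the remaining JSON.
func decodeTransformResponse(body []byte) (TransformTextResponse, error) {
	var resp TransformTextResponse
	// TrimPrefix returns body unchanged when the prefix is absent,
	// so BOM-free responses are handled by the same path.
	body = bytes.TrimPrefix(body, []byte("\xef\xbb\xbf"))
	if err := json.Unmarshal(body, &resp); err != nil {
		return resp, fmt.Errorf("decoding response: %w", err)
	}
	return resp, nil
}

func main() {
	// Assumed sample: BOM followed by the JSON shown in the question.
	body := []byte("\xef\xbb\xbf" +
		`{"ec":0,"em":"OK","sentence":"This is too strange i just want to go home soon"}`)

	resp, err := decodeTransformResponse(body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(resp.ErrorDescriptive, resp.Sentence)
}
```

Trimming unconditionally keeps the call site simple, because the same helper works whether or not the server includes the BOM.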

huangapple
  • Posted on July 14, 2015 at 13:06:12
  • When reposting, please keep the link to this article: https://go.coder-hub.com/31398044.html