How do I make raw Unicode-encoded content readable?

Question

I used `net/http` to request a web API, and the server returned a JSON response. When I print the response body, it is displayed as raw ASCII content. I tried using `bufio.ScanRunes` to parse the content, but that failed.

I also tried writing a simple server that returns a Unicode string, and that worked well.

Here is the core code:

    func (c ClientInfo) Request(method string, url string, form url.Values) string {
    	req, err := http.NewRequest(method, url, strings.NewReader(c.Encode(form)))
    	if err != nil {
    		fmt.Println(err)
    		return ""
    	}
    	req.Header = c.Header
    	req.AddCookie(&c.Cookie)
    	resp, err := http.DefaultClient.Do(req)
    	if err != nil {
    		fmt.Println(err)
    		return ""
    	}
    	// Defer Close only after the error check: resp is nil when Do fails.
    	defer resp.Body.Close()

    	// Attempt to read the body rune by rune.
    	scanner := bufio.NewScanner(resp.Body)
    	scanner.Split(bufio.ScanRunes)

    	var buf bytes.Buffer
    	for scanner.Scan() {
    		buf.WriteString(scanner.Text())
    	}
    	rv := buf.String()
    	fmt.Println(rv)
    	return rv
    }

Here is the example output:

> {"forum":{"id":"3251718","name":"\u5408\u80a5\u5de5\u4e1a\u5927\u5b66\u5ba3\u57ce\u6821\u533a","first_class":"\u9ad8\u7b49\u9662\u6821","second_class":"\u5b89\u5fbd\u9662\u6821","is_like":"0","user_level":"1","level_id":"1","level_name":"\u7d20\u672a\u8c0b\u9762","cur_score":"0","levelup_score":"5","member_num":"80329","is_exists":"1","thread_num":"108762","post_num":"3445881","good_classify":[{"class_id":"0","class_name":"\u5168\u90e8"},{"class_id":"1","class_name":"\u516c\u544a\u7c7b"},{"class_id":"2","class_name":"\u5427\u53cb\u4e13\u533a"},{"class_id":"4","class_name":"\u6d3b\u52a8\u4e13\u533a"},{"class_id":"6","class_name":"\u793e\u56e2\u73ed\u7ea7"},{"class_id":"5","class_name":"\u8d44\u6e90\u5171\u4eab"},{"class_id":"8","class_name":"\u6e29\u99a8\u751f\u6d3b\u7c7b"},{"class_id":"7","class_name":"\u54a8\u8be2\u65b0\u95fb\u7c7b"},{"class_id":"3","class_name":"\u98ce\u91c7\u5c55\u793a\u533a"}],"managers":[{"id":"793092593","name":"yi\u62b9\u660e\u5a9a\u7684\u5fe7\u4f24"},
>
> ...


# Answer 1
**Score**: 2

That is just the standard way to escape any Unicode character.
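To make that concrete, here is a minimal sketch showing that an escaped JSON string and its literal counterpart decode to the same Go string. The helper name `decode` is made up for this illustration; only `json.Unmarshal` comes from the standard library.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decode is a hypothetical helper: it unmarshals a JSON string
// literal and returns the resulting Go string.
func decode(doc string) string {
	var s string
	if err := json.Unmarshal([]byte(doc), &s); err != nil {
		return "error: " + err.Error()
	}
	return s
}

func main() {
	// Raw string literals keep the backslashes, so the JSON decoder,
	// not the Go compiler, is what interprets the escapes.
	escaped := `"\u5168\u90e8"`
	literal := `"全部"`

	// The escaped form is plain ASCII on the wire: 14 bytes.
	fmt.Println(len(escaped), decode(escaped))
	// Both spellings are the same JSON value.
	fmt.Println(decode(escaped) == decode(literal))
}
```

This is also why scanning the body rune by rune changes nothing: the body already consists of the ASCII characters `\`, `u`, and four hex digits, and only a JSON decoder turns them back into the characters they stand for.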

Unmarshal it to see the unquoted text (the `encoding/json` package resolves the escapes while decoding):

    package main

    import (
    	"encoding/json"
    	"fmt"
    )

    func main() {
    	var i interface{}
    	err := json.Unmarshal([]byte(src), &i) // the decoder resolves the \uXXXX escapes
    	fmt.Println(err, i)
    }

    const src = `{"forum":{"id":"3251718","name":"\u5408\u80a5\u5de5\u4e1a\u5927\u5b66\u5ba3\u57ce\u6821\u533a","first_class":"\u9ad8\u7b49\u9662\u6821","second_class":"\u5b89\u5fbd\u9662\u6821","is_like":"0","user_level":"1","level_id":"1","level_name":"\u7d20\u672a\u8c0b\u9762","cur_score":"0","levelup_score":"5","member_num":"80329","is_exists":"1","thread_num":"108762","post_num":"3445881","good_classify":[{"class_id":"0","class_name":"\u5168\u90e8"},{"class_id":"1","class_name":"\u516c\u544a\u7c7b"},{"class_id":"2","class_name":"\u5427\u53cb\u4e13\u533a"},{"class_id":"4","class_name":"\u6d3b\u52a8\u4e13\u533a"},{"class_id":"6","class_name":"\u793e\u56e2\u73ed\u7ea7"},{"class_id":"5","class_name":"\u8d44\u6e90\u5171\u4eab"},{"class_id":"8","class_name":"\u6e29\u99a8\u751f\u6d3b\u7c7b"},{"class_id":"7","class_name":"\u54a8\u8be2\u65b0\u95fb\u7c7b"},{"class_id":"3","class_name":"\u98ce\u91c7\u5c55\u793a\u533a"}]}}`

Output (trimmed):

    <nil> map[forum:map[levelup_score:5 is_exists:1 post_num:3445881 good_classify:[map[class_id:0 class_name:全部] map[class_id:1 class_name:公告类] map[class_id:2 class_name:吧友专区] map[class_id:4 class_name:活动专区] map[class_id:6 class_name:社团班级] map[class_id:5 class_name:资源共享] map[class_id:8 class_name:温馨生活类] map[class_name:咨询新闻类 class_id:7] map[class_id:3 class_name:风采展示区]] id:3251718 is_like:0 cur_score:0

If you just want to unquote a fragment, you may use `strconv.Unquote()`:

    fmt.Println(strconv.Unquote(`"\u7d20\u672a\u8c0b"`))

Output:

    素未谋 <nil>

Note that `strconv.Unquote()` expects a string that is itself wrapped in quotes. That is why I used a raw string literal: it lets me include the quotes, and it keeps the compiler from interpreting the Unicode escapes itself.
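When the fragment arrives without surrounding quotes, one option is to add them before calling `strconv.Unquote()`. The sketch below assumes that situation; `unescape` is a hypothetical helper name, not a standard library function.

```go
package main

import (
	"fmt"
	"strconv"
)

// unescape is a hypothetical helper: it wraps a bare fragment in double
// quotes so that strconv.Unquote can interpret its \uXXXX escapes.
// It returns an error if the fragment is not valid inside a Go
// double-quoted string literal (for example, if it contains a bare ").
func unescape(fragment string) (string, error) {
	return strconv.Unquote(`"` + fragment + `"`)
}

func main() {
	s, err := unescape(`\u7d20\u672a\u8c0b\u9762`)
	fmt.Println(s, err) // prints: 素未谋面 <nil>
}
```

Keep in mind that `strconv.Unquote()` follows Go's string literal syntax, which is close to JSON's but not identical. For instance, the JSON escape `\/` is not a valid Go escape and is rejected. For text that really is JSON, `json.Unmarshal` is likely the safer choice.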

See related question: https://stackoverflow.com/questions/36528575/how-to-convert-escape-characters-in-html-tags/36529158#36529158

huangapple
  • Posted on February 21, 2017, 19:21:43
  • When reposting, please keep the link to this article: https://go.coder-hub.com/42365902.html