Read non-UTF-8 encoded file content and print it correctly

Question

I am trying to read a non-UTF-8 encoded file and print its content, like this:

content, _ := os.ReadFile("example.csv")
fmt.Println(string(content))

Output:

����ҭ��dzԪ�� �Ӻ��Ҵ�˭�

Then I tried to decode the bytes rune by rune as UTF-8, like this:

br := make([]rune, 0)
for len(content) > 0 {
	r, size := utf8.DecodeRune(content)
	br = append(br, r)
	content = content[size:]
}
fmt.Println(string(br))

But the result was the same. How can I get the correct content?
PS: I cannot know the encoding in advance. It can be one of several, such as traditionalchinese.Big5 or japanese.ShiftJIS. The content also does not have to come from a file; it can be a string.

Answer 1

Score: 1

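Most probably you need packages from the golang.org/x/text/encoding hierarchy. The standard library's unicode/utf8 package only interprets bytes as UTF-8; it does not convert from other encodings.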

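In particular, golang.org/x/text/encoding/charmap provides encodings from which you can create an encoding.Decoder that translates a stream of bytes in a legacy non-UTF-8 encoding into a UTF-8 encoded stream, which is the encoding Go uses natively. The Big5 and Shift JIS encodings named in the question live in the sibling packages golang.org/x/text/encoding/traditionalchinese and golang.org/x/text/encoding/japanese.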

huangapple
  • Posted on 2022-12-25 01:27:35
  • When reposting, please keep the link to this article: https://go.coder-hub.com/74909262.html