GoLang - Persist using ISO-8859-1 charset

Question

I'm developing a project where we need to persist our information in a legacy database that has ISO-8859-1 tables. So before writing something to the database I need to convert it from UTF-8 to ISO-8859-1, and every time I retrieve it from the database, I need to convert it back to UTF-8.

I tried using the library code.google.com/p/go-charset/ as follows for each text field that I need to persist:

import (
	"bytes"
	"code.google.com/p/go-charset/charset"
	_ "code.google.com/p/go-charset/data"
	"fmt"
	"io/ioutil"
	"strings"
)

func toISO88591(utf8 string) string {
	buf := new(bytes.Buffer)

	w, err := charset.NewWriter("latin1", buf)
	if err != nil {
		panic(err)
	}
	defer w.Close()

	fmt.Fprintf(w, utf8)
	return buf.String()
}

func fromISO88591(iso88591 string) string {
	r, err := charset.NewReader("latin1", strings.NewReader(iso88591))
	if err != nil {
		panic(err)
	}

	buf, err := ioutil.ReadAll(r)
	if err != nil {
		panic(err)
	}

	return string(buf)
}

The problem is that the data is still persisted in UTF-8 even when I use the function toISO88591. Am I doing something wrong in this conversion?

My database is MySQL, and I'm using the github.com/go-sql-driver/mysql driver with the following connection parameters:

<user>:<password>@tcp(<host>:<port>)/<database>?collation=latin1_general_ci

Best regards!

Answer 1

Score: 7

> package charset
>
> import "code.google.com/p/go-charset/charset"
>
> func NewWriter
>
> func NewWriter(charset string, w io.Writer) (io.WriteCloser, error)
>
> NewWriter returns a new WriteCloser writing to w. It converts
> writes of UTF-8 text into writes on w of text in the named character
> set. The Close is necessary to flush any remaining partially
> translated characters to the output.
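
To see why the placement of Close matters, here is a minimal sketch that uses bufio.Writer from the standard library as a stand-in for the charset writer. This rests on an assumption: that the charset writer may hold output back until Close, as the quoted documentation suggests. The function names deferredFlush and explicitFlush are made up for this illustration. A deferred call runs only after the return expression has been evaluated, so the deferred version reads the buffer before anything has been flushed into it.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
)

// deferredFlush mirrors the shape of the question's code: the flush is
// deferred, so it runs only after buf.String() has been evaluated.
func deferredFlush(s string) string {
	buf := new(bytes.Buffer)
	w := bufio.NewWriter(buf) // stand-in for the charset writer (assumption)
	defer w.Flush()
	w.WriteString(s)
	return buf.String() // evaluated before the deferred Flush runs
}

// explicitFlush flushes before reading the buffer.
func explicitFlush(s string) string {
	buf := new(bytes.Buffer)
	w := bufio.NewWriter(buf)
	w.WriteString(s)
	w.Flush()
	return buf.String()
}

func main() {
	fmt.Printf("%q\n", deferredFlush("hello")) // prints ""
	fmt.Printf("%q\n", explicitFlush("hello")) // prints "hello"
}
```

With bufio.Writer the short input stays in the internal buffer until Flush, which is why the deferred version returns an empty string.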


I would follow the instructions: "The Close is necessary to flush any remaining partially
translated characters to the output." For example,

package main

import (
	"bytes"
	"code.google.com/p/go-charset/charset"
	_ "code.google.com/p/go-charset/data"
	"fmt"
	"io/ioutil"
	"strings"
)

// toISO88591 converts a UTF-8 string to ISO-8859-1 (Latin-1) bytes.
func toISO88591(utf8 string) (string, error) {
	buf := new(bytes.Buffer)
	w, err := charset.NewWriter("latin1", buf)
	if err != nil {
		return "", err
	}
	// Fprint rather than Fprintf, so a '%' in the input is not read as a format verb.
	if _, err := fmt.Fprint(w, utf8); err != nil {
		return "", err
	}
	// Close before reading buf: it flushes any remaining output.
	if err := w.Close(); err != nil {
		return "", err
	}
	return buf.String(), nil
}

// fromISO88591 converts ISO-8859-1 (Latin-1) bytes back to a UTF-8 string.
func fromISO88591(iso88591 string) (string, error) {
	r, err := charset.NewReader("latin1", strings.NewReader(iso88591))
	if err != nil {
		return "", err
	}
	buf, err := ioutil.ReadAll(r)
	if err != nil {
		return "", err
	}
	return string(buf), nil
}

func main() {
	utfi := "£5 for Peppé"
	fmt.Printf("%q\n", utfi)
	iso, err := toISO88591(utfi)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("%q\n", iso)
	utfo, err := fromISO88591(iso)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("%q\n", utfo)
	// The round trip should reproduce the original string.
	fmt.Println(utfi == utfo)
}

Output:

"£5 for Peppé"
"\xa35 for Pepp\xe9"
"£5 for Peppé"
true
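
As a side note, ISO-8859-1 maps each of its 256 byte values directly to the Unicode code point with the same number (U+0000 to U+00FF), so the same round trip can also be sketched with only the standard library. This is an alternative to the charset package, not what the answer above uses, and the names toLatin1 and fromLatin1 are made up for this illustration. Characters outside that range have no ISO-8859-1 encoding, so the sketch returns an error for them instead of substituting something silently.

```go
package main

import (
	"fmt"
)

// toLatin1 encodes a UTF-8 string as ISO-8859-1 bytes.
// It fails on any rune above U+00FF, including the U+FFFD replacement
// rune that ranging over invalid UTF-8 produces.
func toLatin1(s string) ([]byte, error) {
	out := make([]byte, 0, len(s))
	for _, r := range s {
		if r > 0xFF {
			return nil, fmt.Errorf("rune %q has no ISO-8859-1 encoding", r)
		}
		out = append(out, byte(r))
	}
	return out, nil
}

// fromLatin1 decodes ISO-8859-1 bytes into a UTF-8 string.
// Every byte value is a valid code point, so this cannot fail.
func fromLatin1(b []byte) string {
	runes := make([]rune, len(b))
	for i, c := range b {
		runes[i] = rune(c)
	}
	return string(runes)
}

func main() {
	in := "£5 for Peppé"
	iso, err := toLatin1(in)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("%q\n", iso)
	fmt.Println(fromLatin1(iso) == in)
}
```

Returning []byte rather than string for the encoded form makes it harder to mistake the Latin-1 data for valid UTF-8 text further down the line.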

Posted on July 3, 2014 at 22:02:09