UTF-8 characters incorrect after selecting from Postgres
Question
I have a database table in Postgres containing email addresses. One of the customers has an umlaut (ü) in their email address. This shouldn't be an issue, but somehow the string in Go contains the wrong byte sequence (E3BC instead of C3BC), which later causes a number of problems.
I'm connecting to the database with client_encoding=UTF8, and the database is set up for UTF8. If I run the following, I can see that the byte sequence in the database is as expected:
SELECT encode("email"::bytea, 'hex') FROM participants WHERE email like 'XXXXXX%';
encode
----------------------------------------------
c3bc
(the rest of the data has been hidden)
I read the data using the database/sql package and the Postgres driver. If I print the string in Go, I get XXXXXXe3bcXXXXXX, which is not what I expect (again, the rest of the email is hidden with X's).
Is this a bug, or am I misunderstanding something?
Answer 1
Score: 0
Make sure your database is correctly set up for UTF8. The locale settings are fixed when the database is created and might cause issues with SQL functions like LOWER. Re-create the database cluster with pg_dropcluster and pg_createcluster --encoding=UTF8. Note that pg_dropcluster deletes all data in the cluster, so take a backup first.
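Before re-creating anything, it may help to confirm what bytes actually arrive in the Go string. The following is only a minimal sketch: the describe helper and the sample strings are made up for illustration, and in real code the input would be the value scanned from the database/sql row. It dumps the raw bytes as hex and checks whether they form valid UTF-8.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"unicode/utf8"
)

// describe is a hypothetical helper: it returns the hex encoding of the
// raw bytes of s and reports whether those bytes are valid UTF-8.
func describe(s string) (string, bool) {
	return hex.EncodeToString([]byte(s)), utf8.ValidString(s)
}

func main() {
	// Sample values (assumptions, not data from the question).
	good := "m\u00fcller"  // ü encoded as C3 BC
	bad := "m\xe3\xbcller" // E3 BC, the sequence reported in the question

	for _, s := range []string{good, bad} {
		h, ok := describe(s)
		fmt.Printf("hex=%s validUTF8=%v\n", h, ok)
	}
}
```

If the hex dump of the scanned value already shows e3bc, the bytes were changed before they reached the string (in the stored data, the connection, or the driver). If it shows c3bc, the corruption more likely happens in a later processing or printing step.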