What happens if I send UTF-8 data to a MySQL database with a character set of latin1?

Question

Please explain this situation: I have a MySQL database whose connection is set to latin1, and whose tables and columns also use the latin1 character set. I send it UTF-8 encoded data (e.g. from a web form that encodes as UTF-8 via PHP). Later I retrieve that data and display it on a web page set to use UTF-8 encoding.

Will I see what I put in? I know I will for ASCII, but what about, for example, ü (a German umlaut)? And what about a more exotic non-latin1 character?

In fact, my simple test of sending UTF-8 encoded Japanese characters to the database from WordPress and then viewing them on a web page suggests that there are no problems. I suspect that there is no conversion and that the database just stores the bytes it gets. But if that is the case, what is the significance of setting a character set? It is not a collation, which (I think) has to do with sorting.

Thank you.

Answer 1

Score: 1

The database will attempt to convert your data from the connection character set (latin1) to the character set of the column whenever the two differ (columns are often utf8mb4, the default in MySQL 8.0). Because the bytes you send are really UTF-8 rather than latin1, this conversion is likely to corrupt your string.

Even if the data survives the round trip, searching or ordering on this column will not work reliably, because MySQL treats each byte of a multi-byte UTF-8 sequence as a separate latin1 character. If no alternative is available, it would be preferable to store your UTF-8 string in a VARBINARY column.
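To make the searching and ordering problem concrete, here is a minimal sketch that simulates the mislabeled bytes outside the database. It is plain Python, not MySQL: it assumes Python's `latin-1` codec (ISO-8859-1) as a stand-in for MySQL's `latin1` (which is closer to cp1252), and the helper names `store_as_latin1` and `read_back_as_utf8` are invented for illustration.

```python
def store_as_latin1(text: str) -> str:
    """Simulate UTF-8 bytes being taken for latin1 text, one character per byte."""
    return text.encode("utf-8").decode("latin-1")


def read_back_as_utf8(stored: str) -> str:
    """Simulate a UTF-8 page reinterpreting the same, unchanged bytes."""
    return stored.encode("latin-1").decode("utf-8")


original = "Müller"
stored = store_as_latin1(original)

print(repr(stored))                # 'MÃ¼ller': what the latin1 column believes it holds
print(len(original), len(stored))  # 6 7: the ü now counts as two characters
print(read_back_as_utf8(stored) == original)  # True: the bytes themselves are intact
print("ü" in stored)               # False: searching for the real character fails
```

This is consistent with the test described in the question: the page looks fine because the bytes come back unchanged, yet from the database's point of view the column holds different, longer text.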

Answer 2

Score: 1

See https://stackoverflow.com/questions/38363566/trouble-with-utf8-characters-what-i-see-is-not-what-i-stored

latin1 can handle ü (u-umlaut) and other Western European characters. But you must tell MySQL that the client is talking latin1, and declare the columns to be utf8mb4 (or whatever combination you actually have).

latin1 cannot represent characters from any Asian script.
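As a quick illustration of which characters latin1 can represent, the sketch below tries to encode two strings. It uses Python's `latin-1` codec (ISO-8859-1) as an approximation of MySQL's `latin1`, and `fits_in_latin1` is a hypothetical helper, not a MySQL or library function.

```python
def fits_in_latin1(text: str) -> bool:
    """Return True if every character in text has a latin-1 encoding."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


print(fits_in_latin1("ü"))       # True
print(fits_in_latin1("日本語"))   # False
print("ü".encode("latin-1"))     # b'\xfc': one byte in latin1
print("ü".encode("utf-8"))       # b'\xc3\xbc': two bytes in UTF-8
```

The one-byte versus two-byte difference for ü is why the declared client character set has to match the bytes that are actually sent.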

huangapple
  • Posted on May 18, 2023 at 05:00:29
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76276157.html