What happens if I send UTF-8 data to a MySQL database with a character set of latin1?
Question
Please explain this situation: I have a MySQL database whose connection is set to latin1, and the character set of its tables and columns is also latin1. I send it UTF-8 encoded data (e.g. from a web form that encodes as UTF-8 via PHP). Later I retrieve that data and display it on a web page set to use UTF-8 encoding.
Will I see what I put in? I know I will for ASCII, but how about, for example, ü (a German umlaut)? And what about a more exotic non-Latin1 character?
In fact, my simple test of sending UTF-8 encoded Japanese characters to the database from WordPress and then viewing them on a web page suggests that there are no problems. I suspect that there is no conversion and that the database just stores the bytes it gets. But if that is the case, what is the significance of setting a character set? It is not a collation, which (I think) has to do with sorting.
Thank you.
Answer 1
Score: 1
The database will attempt to convert your latin1 connection data to its internal encoding (typically utf8mb4, which is determined during installation). This process is likely to corrupt your string.
Even if the conversion were successful, searching or ordering this column would not be possible. If no alternatives are available, it would be preferable to store your UTF-8 string as a varbinary type.
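To make the byte-level behaviour concrete, here is a rough Python model of what a connection declared as latin1 does to UTF-8 bytes. It is only a sketch: Python's "latin-1" codec stands in for MySQL's latin1 (which is actually based on cp1252), the two helper functions are hypothetical names invented for illustration, and no real MySQL server is involved.

```python
# Sketch only: models the charset conversions, it does not talk to MySQL.
# Assumption: Python's "latin-1" codec approximates MySQL's latin1.

def store_via_latin1_connection(client_bytes: bytes, column_charset: str) -> bytes:
    """Bytes the server would store when the connection is declared latin1."""
    # The server trusts the declaration and reads each byte as one latin1 character.
    text_as_server_sees_it = client_bytes.decode("latin-1")
    # It then converts those characters to the column's character set.
    return text_as_server_sees_it.encode(column_charset)

def read_via_latin1_connection(stored: bytes, column_charset: str) -> bytes:
    """Bytes sent back to a client whose connection is declared latin1."""
    return stored.decode(column_charset).encode("latin-1")

original = "ü".encode("utf-8")          # b'\xc3\xbc', what the web form sends

# latin1 connection + latin1 column: nothing to convert, bytes stored unchanged.
stored_latin1 = store_via_latin1_connection(original, "latin-1")

# latin1 connection + UTF-8 column: each byte is re-encoded ("double encoding").
stored_utf8 = store_via_latin1_connection(original, "utf-8")

# Reading back over the same latin1 connection undoes the conversion.
round_trip = read_via_latin1_connection(stored_utf8, "utf-8")

print(stored_latin1)                    # b'\xc3\xbc'
print(stored_utf8)                      # b'\xc3\x83\xc2\xbc'
print(round_trip.decode("utf-8"))       # ü
```

In this model the page shows the right character again only because the same wrong declaration is applied on the way in and on the way out. The server itself believes the column holds two characters ("Ã¼"), which is why lengths, comparisons and sorting on such a column can be wrong.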
Answer 2
Score: 1
latin1 can handle umlaut-u and other Western European characters. But you must tell MySQL that the client is talking latin1, and declare the columns to be utf8mb4. (Or whatever combination you have.)
Latin1 cannot handle any Asian character set.
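As a quick illustration of that limit, the following Python sketch checks whether a string can be represented in a single-byte Latin-1 encoding at all. The helper name fits_in_latin1 is made up for this example, and Python's "latin-1" codec (ISO-8859-1) is used as an approximation of MySQL's latin1, which is cp1252-based and differs in a few code points.

```python
def fits_in_latin1(text: str) -> bool:
    """Return True if every character of text has a Latin-1 byte value."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False

umlaut_ok = fits_in_latin1("ü")         # Western European: representable
japanese_ok = fits_in_latin1("日本語")   # Japanese: not representable
umlaut_bytes = "ü".encode("latin-1")    # a single byte in Latin-1

print(umlaut_ok, japanese_ok)           # True False
print(umlaut_bytes)                     # b'\xfc'
```

So when the client really does send latin1 and says so, ü arrives as the single byte 0xFC and can be converted correctly, whereas Japanese text has no latin1 representation to begin with.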