Incorrect string value: \x[XX] when converting the column from VARBINARY to VARCHAR

Question

After upgrading from MySQL 5.7 to 8.0, we realized we needed to convert all tables from latin1 to utf8mb4. After a lot of searching, I found that the recommended approach is to first convert the text columns to VARBINARY (or BLOB for TEXT columns) and then back to VARCHAR (or TEXT) with the correct CHARACTER SET.

Running the command:

ALTER TABLE [table] MODIFY COLUMN [column] VARBINARY([size]);

runs fine, and the column is changed to the VARBINARY type.

But the conversion back:

ALTER TABLE [table] MODIFY COLUMN [column] VARCHAR([size]) CHARACTER SET utf8mb4;

returns an error:

Incorrect string value: '\xF3nica' for column '[column]' at row [index]

The offending value varies across tables and columns, but the error always occurs when the field contains a non-ASCII character (accents, non-English letters, etc.). This particular error comes from a first-name column containing Mónica (\xF3 = ó). Why is this value incorrect for utf8mb4?

I'm also frustrated by this error because when we ran MySQL 5.7 with the default latin1 character set, we had no errors with non-ASCII characters. Everything was working fine. The problem started after the MySQL version upgrade.

Answer 1

Score: 1

The default character set was latin1 in 5.7 but is utf8mb4 in 8.0. If you explicitly set the character set for your connection to latin1, it would behave exactly as it did before the upgrade, because the connection and column character sets would then match and no implicit conversion would take place.
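To see why the final ALTER rejects the value, it may help to look at the bytes themselves. The Python sketch below is only an illustration, not something MySQL runs: the sample bytes are taken from the error message in the question, and Python's latin-1 codec is an assumption standing in for MySQL's latin1 (which is closer to cp1252, although both map 0xF3 to ó). It shows that the stored byte sequence is a valid latin1 string but not a valid UTF-8 one, so the bytes have to be transcoded rather than simply relabelled.

```python
# Bytes as they would sit in the column after the VARBINARY step:
# "Mónica" encoded in latin1, where ó is the single byte 0xF3.
raw = b"M\xf3nica"

# Interpreted as latin1, every byte maps to one character.
as_latin1 = raw.decode("latin-1")

# Interpreted as UTF-8, 0xF3 announces a 4-byte sequence, but the bytes
# that follow ("nic") are not continuation bytes, so decoding fails.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# Transcoding (decode as latin1, re-encode as UTF-8) turns ó into the
# two bytes 0xC3 0xB3, which is what a utf8mb4 column expects.
transcoded = as_latin1.encode("utf-8")

print(as_latin1)         # Mónica
print(valid_utf8)        # False
print(transcoded.hex())  # 4dc3b36e696361
```

Relabelling the raw bytes as utf8mb4 is only safe when they already hold UTF-8 data; for genuine latin1 data a transcoding step is needed somewhere in the migration.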

To get the column changed to utf8mb4, try converting the column content before changing the column data type:

ALTER TABLE [table] MODIFY COLUMN [column] VARBINARY([size]);
UPDATE [table] SET [column] = CONVERT([column] USING utf8mb4);
ALTER TABLE [table] MODIFY COLUMN [column] VARCHAR([size]) CHARACTER SET utf8mb4;

You should be able to skip the VARBINARY step:

UPDATE [table] SET [column] = CONVERT(CAST([column] AS BINARY) USING utf8mb4);
ALTER TABLE [table] MODIFY COLUMN [column] VARCHAR([size]) CHARACTER SET utf8mb4;

huangapple
  • Posted on June 12, 2023 at 15:53:00
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76454574.html