在 MySQL 中检测 utf8 损坏的字符

2021-12-28 00:00:00 utf-8 mysql

我有一个数据库，其中有一堆散落在多个表中的损坏的 utf8 字符.字符列表不是很广泛 AFAIK (áéíúóÁÉÍÓÚÑñ)

I've got a database with a bunch of broken utf8 characters scattered across several tables. The list of characters isn't very extensive AFAIK (áéíúóÁÉÍÓÚÑñ)

修复给定的表非常简单

update orderItem set itemName=replace(itemName,'Ã¡','á');

但是我无法找到检测损坏字符的方法.如果我做类似的事情

But I can't get a way of detecting the broken characters. If I do something like

SELECT * FROM TABLE WHERE field LIKE "%Ã%";

由于排序规则 (Ã=a)，我获得了几乎所有字段.到目前为止，所有损坏的字符都以Ã"开头.数据库是西班牙语的，所以不使用这个特定的字符

I get nearly all the fields because of the collation (Ã=a). All broken characters so far start with an "Ã". The database is in spanish so this particular character isn't used

到目前为止我得到的损坏字符列表是

The list of broken chars I've got so far is

Ã¡ = á Ã© = é Ã- = í Ã³ = ó Ã± = ñ Ã¡ = Á

知道如何使这个 SELECT 按预期工作吗?(二进制搜索或类似的东西)

Any idea of how to make this SELECT to work as intended? (a binary search or something like that)

推荐答案

另一种方法如何，即来回转换列以获得正确的字符集?您可以将其转换为二进制，然后转换为 utf-8，然后转换为 iso-8859-1 或您正在使用的任何其他内容.有关详细信息，请参阅手册.

How about a different approach, namely converting the column back and forth to get the correct character set? You can convert it to binary, then to utf-8 and then to iso-8859-1 or whatever else you're using. See the manual for the details.

相关文章