Normalize a Unicode string in SQL Server?
Is there a function in SQL Server to normalize a Unicode string? e.g.
UPDATE Orders SET Notes = NormalizeString(Notes, 'FormC')
Unicode Normalization Forms:
- Composition (C): A + ¨ becomes Ä
- Decomposition (D): Ä becomes A + ¨
- Compatible Composition (KC): A + ¨ + ﬁ + n becomes Ä + f + i + n
- Compatible Decomposition (KD): Ä + ﬁ + n becomes A + ¨ + f + i + n
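To make the four forms concrete, here is a minimal sketch using Python's standard `unicodedata` module (not SQL Server). It assumes the "¨" in the list above stands for the combining diaeresis U+0308 and that "ﬁ" is the single ligature character U+FB01; the helper name `codepoints` is only for illustration.

```python
import unicodedata

def codepoints(s):
    """Return the code points of s as readable U+XXXX labels."""
    return [f"U+{ord(c):04X}" for c in s]

decomposed = "A\u0308"    # A followed by COMBINING DIAERESIS
precomposed = "\u00c4"    # the single precomposed character Ä
ligature_n = "\ufb01n"    # the fi ligature followed by n

# Canonical forms only compose or decompose; the ligature is left alone.
nfc = unicodedata.normalize("NFC", decomposed)
nfd = unicodedata.normalize("NFD", precomposed)

# Compatibility forms additionally replace the ligature with plain f + i.
nfkc = unicodedata.normalize("NFKC", decomposed + ligature_n)
nfkd = unicodedata.normalize("NFKD", precomposed + ligature_n)

for name, value in [("NFC", nfc), ("NFD", nfd), ("NFKC", nfkc), ("NFKD", nfkd)]:
    print(name, codepoints(value))
```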
I cannot find any built-in function, so I assume there is none.
Ideally, if there can be only one, then I happen to need Form C today:
Unicode normalization form C, canonical composition. Transforms each decomposed grouping, consisting of a base character plus combining characters, to the canonical precomposed equivalent. For example, A + ¨ becomes Ä.
See also
- Unicode Normalization in Windows
- How do I remove diacritics (accents) from a string in .NET?
- NormalizeString function
- Sorting it all Out: What normalization form does SQL Server use
Recommended answer
Sorry, no, there is no such function in any version of SQL Server to date (as of the 2012 test builds). Comparisons can be correctly composition-insensitive, but there isn't a function to convert character composition usage to one normal form.
It has been suggested for a future version of the ANSI standard under the syntax NORMALIZE(string, NFC), but it's going to be a long time before this makes it to the real world. For now, if you want to do normalisation you'll have to do it in a proper programming language with better string-handling capabilities, either by pulling the string out of the database or by writing a CLR stored procedure to do it.
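As a rough sketch of the "pull the string out of the database" route, the Python snippet below normalizes already-fetched rows to Form C and keeps only those that actually change. The function name `rows_needing_nfc`, the `(order_id, notes)` row shape, and the sample data are assumptions for illustration; fetching and writing back would go through whatever database driver you already use.

```python
import unicodedata

def rows_needing_nfc(rows):
    """Yield (order_id, normalized_notes) for rows whose notes are not already NFC.

    rows is any iterable of (order_id, notes) pairs; NULL notes (None) are skipped.
    """
    for order_id, notes in rows:
        if notes is None:
            continue
        normalized = unicodedata.normalize("NFC", notes)
        if normalized != notes:
            yield order_id, normalized

# Hypothetical sample standing in for the result of a SELECT on Orders.
sample = [
    (1, "A\u0308rger"),   # decomposed: A + combining diaeresis
    (2, "\u00c4rger"),    # already precomposed, needs no update
    (3, None),            # NULL notes
]

updates = list(rows_needing_nfc(sample))
print(updates)
```

Each resulting pair could then be written back with a parameterized `UPDATE Orders SET Notes = ? WHERE OrderId = ?`, where `OrderId` is an assumed key column name. Skipping unchanged rows keeps the update set small on large tables.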