Convert a string to XML and insert it into SQL Server

2022-01-16 00:00:00 tsql sql-server sql-server-2008-r2

We have a SQL Server 2008 R2 database table with XML stored in a column of VARCHAR data type.

I now have to fetch some of the elements of the XML.

So I first want to convert the XML stored as a VARCHAR value into a value of the xml data type.

Example:

Table A

Id (int), ProductXML (varchar(max))

Table B

Id (int), ProductXML (xml)

I want to convert ProductXML from Table A to the xml data type and insert it into Table B.

I tried using the CAST() and CONVERT() functions as shown below:

insert into TableB (ProductXML)
select CAST(ProductXML as XML) from TableA;

I tried CONVERT in the same way, but I get an error:

XML Parsing : unable to switch encoding

Is there any way I can convert the varchar entries in the table into XML entries?

About the XML: it is huge, with many nodes, and its structure varies from row to row.

Example: one row can hold an XML entry for a single product, while another row can hold an XML entry for multiple products.

Recommended answer

Give us a sample of your XML, since all of these work:

CONVERT(XML, '<root><child/></root>')
CONVERT(XML, '<root>          <child/>         </root>', 1)
CAST('<Name><FName>Carol</FName><LName>Elliot</LName></Name>'  AS XML)

You might also have to cast it to nvarchar or varbinary first (from the Microsoft documentation):

You can parse any of the SQL Server string data types, such as [n][var]char, [n]text, varbinary, and image, into the xml data type by casting (CAST) or converting (CONVERT) the string to the xml data type. Untyped XML is checked to confirm that it is well formed. If there is a schema associated with the xml type, validation is also performed. For more information, see Compare Typed XML to Untyped XML.

XML documents can be encoded with different encodings (for example, UTF-8, UTF-16, windows-1252). The following outlines the rules on how the string and binary source types interact with the XML document encoding and how the parser behaves.

Since nvarchar assumes a two-byte Unicode encoding such as UTF-16 or UCS-2, the XML parser will treat the string value as a two-byte Unicode encoded XML document or fragment. This means that the XML document needs to be encoded in a two-byte Unicode encoding as well to be compatible with the source data type. A UTF-16 encoded XML document can have a UTF-16 byte order mark (BOM), but it does not need to, since the context of the source type makes it clear that it can only be a two-byte Unicode encoded document.

The content of a varchar string is treated as a one-byte encoded XML document/fragment by the XML parser. Since the varchar source string has a code page associated, the parser will use that code page for the encoding if no explicit encoding is specified in the XML itself. If an XML instance has a BOM or an encoding declaration, the BOM or declaration needs to be consistent with the code page, otherwise the parser will report an error.
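The varchar and nvarchar rules above can be reproduced with a small sketch. The sample documents and variable names are made up for illustration. The commented-out statement is the one I would expect to raise the encoding error, going by the rules quoted above; it is not taken from the asker's data.

```sql
-- Hypothetical sample values, for illustration only.
DECLARE @v varchar(max)  = '<?xml version="1.0" encoding="UTF-16"?><root><child/></root>';
DECLARE @n nvarchar(max) = N'<?xml version="1.0" encoding="UTF-16"?><root><child/></root>';

-- nvarchar source + two-byte declaration: the two agree, so this should parse.
SELECT CAST(@n AS xml) AS FromNvarchar;

-- varchar source + UTF-16 declaration: the declaration conflicts with the
-- one-byte source type, so this is expected to fail with the encoding error.
-- SELECT CAST(@v AS xml);

-- Converting the varchar value to nvarchar first makes the source type
-- and the declaration agree.
SELECT CAST(CAST(@v AS nvarchar(max)) AS xml) AS FromVarcharViaNvarchar;
```

If the stored documents carry a declaration like this one, that mismatch is a likely cause of the error in the question.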

The content of varbinary is treated as a codepoint stream that is passed directly to the XML parser. Thus, the XML document or fragment needs to provide the BOM or other encoding information inline. The parser will only look at the stream to determine the encoding. This means that UTF-16 encoded XML needs to provide the UTF-16 BOM, and an instance without a BOM and without an encoding declaration will be interpreted as UTF-8.
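A minimal sketch of the varbinary rule, again with made-up sample values: the parser sees only the bytes, so any encoding information has to travel inside the stream itself.

```sql
-- Hypothetical sample: the bytes of a single-byte string with no BOM
-- and no encoding declaration.
DECLARE @b varbinary(max) = CAST('<root><child/></root>' AS varbinary(max));

-- With neither a BOM nor a declaration, the parser should read the bytes as UTF-8.
SELECT CAST(@b AS xml) AS FromVarbinary;

-- A declaration inside the byte stream tells the parser which encoding to use.
DECLARE @d varbinary(max) =
    CAST('<?xml version="1.0" encoding="windows-1252"?><root><child/></root>' AS varbinary(max));

SELECT CAST(@d AS xml) AS FromVarbinaryWithDeclaration;
```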

If the encoding of the XML document is not known in advance and the data is passed as string or binary data instead of XML data before casting to XML, it is recommended to treat the data as varbinary. For example, when reading data from an XML file using OpenRowset(), one should specify the data to be read as a varbinary(max) value:

select CAST(x as XML) 
from OpenRowset(BULK 'filename.xml', SINGLE_BLOB) R(x)

SQL Server internally represents XML in an efficient binary representation that uses UTF-16 encoding. User-provided encoding is not preserved, but is considered during the parse process.

Solution:

CONVERT(XML, CONVERT(NVARCHAR(max), ProductXML))
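Applied to the tables from the question, the expression above might be used as follows. This is a sketch under two assumptions that the question does not confirm: that TableB.Id is filled in automatically (for example by an IDENTITY column), and that the stored documents declare a two-byte encoding such as UTF-16.

```sql
-- Assumes TableB.Id is populated automatically (for example an IDENTITY column);
-- otherwise add Id to both column lists.
INSERT INTO TableB (ProductXML)
SELECT CONVERT(XML, CONVERT(NVARCHAR(MAX), ProductXML))  -- varchar -> nvarchar -> xml
FROM TableA;
```

By the rules quoted above, a row whose declaration names a single-byte encoding would be expected to fail when routed through nvarchar, so it is worth checking which declarations actually occur in TableA before running this over the whole table.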
