XML Parsing - Illegal XML Character (when executing a stored procedure; running the procedure's queries directly causes no error)
I have a valid XML document (this has been confirmed using multiple XML validators including online validators and the Sublime Text XML validator plugin).
I receive the following error when attempting to import the XML document into MSSQL 2008 using a stored procedure named ImportNXML (command: exec [dbo].[ImportNXML];)
Msg 9420, Level 16, State 1, Line 2
XML parsing: line 17, character 35, illegal xml character
I have confirmed that there are no illegal characters in the XML document, and line 17, character 35 is just the number 1. I've tried modifying this line, replacing the entire line with letters, replacing the entire line with a single number, and padding other lines in the document before this line with letters/numbers, but I receive exactly the same error complaining about exactly the same location.
If I open the ImportNXML stored procedure and run the query contents, I receive no errors at all.
What could be causing the stored procedure to fail when it is executed using the 'exec' command but succeed when the procedure contents are executed as an expanded query?
Mock data for the first 17 lines is as follows:
<?xml version="1.0" ?>
<ClientData>
<Policy><policyName>The Policy Name</policyName>
<Preferences><ServerPreferences><preference><name>Sessions</name>
<value>3</value>
</preference>
<preference><name>Detection</name>
<value>yes</value>
</preference>
<preference><name>Mac</name>
<value>no</value>
</preference>
<preference><name>Plugin</name>
<value>108478;84316;32809;93635;36080;87560;61117;35292;75260;83156;61271;103773;12899;82513;56376;77796;85655;60338;56763;79951;</value>
</preference>
<preference><name>TARGET</name>
<value>123.123.123.123,234.234.234.234</value>
The portion of the stored proc that imports the XML is as follows:
EXEC('
    INSERT INTO XmlImportTest(xmlFileName, xml_data)
    SELECT ''' + @importPath + ''', xmlData
    FROM (
        SELECT * FROM OPENROWSET (BULK ''' + @importPath + ''', SINGLE_BLOB) AS XMLDATA
    ) AS FileImport (XMLDATA)
')
Recommended answer
Pure guessing:
- The file is utf-8 encoded (or uses some other encoding that SQL-Server 2008 cannot read natively).
  - Be aware that SQL-Server is rather limited with file encodings. CHAR (or VARCHAR) is extended ASCII 1-byte encoding and NCHAR (or NVARCHAR) is UCS-2 2-byte encoding (which is almost identical to UTF-16).
  - With SQL-Server 2016 (and SP2 for v2014) some further support was introduced, especially for utf-8.
  - Try to open your XML with an appropriate editor (e.g. Notepad++) and try to find out the file's encoding. Try to save it as "unicode / UCS-2 / utf-16" and retry the import.
  - Try to use your import with CLOB instead of BLOB (that is, SINGLE_CLOB instead of SINGLE_BLOB in OPENROWSET). Reading the file as a binary large object takes the bytes one after the next. SQL-Server will try to read these bytes as a string with a fixed size per character. A character LOB might work under special circumstances.
  - Check the first two bytes for a BOM (byte order mark).
- Open the file with a hex editor and try to find strange codes.
- In such cases you sometimes run into truncation or string-breaking quotes.
- If you import data and you expect issues, it is highly recommended to use a 2-step approach.
  - Read your file into a tolerant staging table (with NVARCHAR(MAX) or even VARBINARY(MAX) target columns) and try to continue from there.
  - It might be necessary to use another tool to change your file before the import.
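The encoding checks above (looking for a BOM, hunting for strange codes, re-saving as UTF-16) can also be scripted instead of done by hand in an editor. The following is only a minimal sketch in Python, not part of the original answer: the function names `detect_bom`, `illegal_xml_chars` and `to_utf16_le_with_bom` are made up for illustration, and the fallback to utf-8 for files without a BOM is an assumption that may not hold for your file.

```python
import codecs

# Known byte order marks, longest first so UTF-32 LE is not mistaken for UTF-16 LE.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data: bytes):
    """Return (encoding, bom_length) if data starts with a known BOM, else (None, 0)."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def illegal_xml_chars(text: str):
    """Return (line, column, codepoint) for characters outside the XML 1.0 Char range."""
    hits = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            cp = ord(ch)
            legal = (cp in (0x9, 0xA, 0xD) or 0x20 <= cp <= 0xD7FF
                     or 0xE000 <= cp <= 0xFFFD or 0x10000 <= cp <= 0x10FFFF)
            if not legal:
                hits.append((line_no, col, hex(cp)))
    return hits


def to_utf16_le_with_bom(data: bytes, fallback_encoding: str = "utf-8") -> bytes:
    """Decode using the BOM (or the assumed fallback) and re-encode as UTF-16 LE with a BOM."""
    encoding, bom_len = detect_bom(data)
    text = data[bom_len:].decode(encoding or fallback_encoding)
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")
```

To use it, read the file in binary mode (`open(path, "rb").read()`, where `path` is whatever your import path is), inspect the result of `detect_bom`, run `illegal_xml_chars` on the decoded text, and write the output of `to_utf16_le_with_bom` to a new file before retrying the import. Note that the sketch does not touch an `encoding="..."` attribute in the XML declaration; the mock data above has none, but if your real file declares one, it would probably need to agree with the new encoding.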