SQL Server XML processing: join different nodes based on ID

2021-09-10 00:00:00 xml xpath tsql sql-server

I am trying to query XML with SQL. Suppose I have the following XML.

<xml>
    <dataSetData>
        <text>ABC</text>
    </dataSetData>
    <generalData>
        <id>123</id>
        <text>text data</text>
    </generalData>
    <generalData>
        <id>456</id>
        <text>text data 2</text>
    </generalData>
    <specialData>
        <id>123</id>
        <text>special data text</text>
    </specialData>
    <specialData>
        <id>456</id>
        <text>special data text 2</text>
    </specialData>
</xml>

I want to write a SELECT query that returns 2 rows as follows:

DataSetData | GeneralDataID | GeneralDataText | SpecialDataTest
ABC         | 123           | text data       | special data text
ABC         | 456           | text data 2     | special data text 2

My current approach is as follows:

SELECT 
    dataset.nodes.value('(dataSetData/text)[1]', 'nvarchar(500)'),
    general.nodes.value('(generalData/text)[1]', 'nvarchar(500)'),
    special.nodes.value('(specialData/text)[1]', 'nvarchar(500)')
FROM @MyXML.nodes('xml') AS dataset(nodes)
   OUTER APPLY @MyXML.nodes('xml/generalData') AS general(nodes)
   OUTER APPLY @MyXML.nodes('xml/specialData') AS special(nodes)
WHERE 
    general.nodes.value('(generalData/text/id)[1]', 'nvarchar(500)') = special.nodes.value('(specialData/text/id)[1]', 'nvarchar(500)')

What I do not like here is that I have to use OUTER APPLY twice and that I have to use the WHERE clause to JOIN the correct elements.

My question therefore is: is it possible to construct the query so that I do not need the WHERE clause to match the nodes? I am pretty sure it will hurt performance badly as files become larger.

Shouldn't it be possible to JOIN the correct nodes (that is, the corresponding generalData and specialData nodes) with some XPATH statement?

Solution

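Your XPath expressions are completely off. After nodes('xml/generalData') the context node is already the generalData element, so the paths passed to value() must be relative to it (text, not generalData/text). Also, id is a sibling of text, not a child of it, so the path is id, not text/id.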

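Please try the following. It is pretty efficient: each value is read relative to the node returned by nodes(), and text() addresses the text node directly, which usually helps with untyped XML. You can test its performance with a large XML document.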

SQL

-- DDL and sample data population, start
DECLARE @xml XML = 
N'<xml>
    <dataSetData>
        <text>ABC</text>
    </dataSetData>
    <generalData>
        <id>123</id>
        <text>text data</text>
    </generalData>
    <generalData>
        <id>456</id>
        <text>text data 2</text>
    </generalData>
    <specialData>
        <id>123</id>
        <text>special data text</text>
    </specialData>
    <specialData>
        <id>456</id>
        <text>special data text 2</text>
    </specialData>
</xml>';
-- DDL and sample data population, end

SELECT c.value('(dataSetData/text/text())[1]', 'VARCHAR(20)') AS DataSetData
    , g.value('(id/text())[1]', 'INT') AS GeneralDataID 
    , g.value('(text/text())[1]', 'VARCHAR(30)') AS GeneralDataText
    , sp.value('(id/text())[1]', 'INT') AS SpecialDataID 
    , sp.value('(text/text())[1]', 'VARCHAR(30)') AS SpecialDataTest
FROM @xml.nodes('/xml') AS t(c)
    OUTER APPLY c.nodes('generalData') AS general(g)
    OUTER APPLY c.nodes('specialData') AS special(sp)
WHERE g.value('(id/text())[1]', 'INT') = sp.value('(id/text())[1]', 'INT');

Output

+-------------+---------------+-----------------+---------------+---------------------+
| DataSetData | GeneralDataID | GeneralDataText | SpecialDataID |   SpecialDataTest   |
+-------------+---------------+-----------------+---------------+---------------------+
| ABC         |           123 | text data       |           123 | special data text   |
| ABC         |           456 | text data 2     |           456 | special data text 2 |
+-------------+---------------+-----------------+---------------+---------------------+
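
The question also asked whether the matching could be done in the XPath itself rather than in a WHERE clause. A minimal sketch of one way to do that, assuming SQL Server's sql:column() XQuery extension function: the id of each generalData node is pulled into a relational column first, and that column is then referenced inside a predicate on specialData. The derived-table alias ids and the extra CROSS APPLY are illustrative choices, not part of the original answer, and no claim is made that this is faster; compare execution plans on realistically sized data before choosing.

```sql
-- Sample data, same as above
DECLARE @xml XML =
N'<xml>
    <dataSetData><text>ABC</text></dataSetData>
    <generalData><id>123</id><text>text data</text></generalData>
    <generalData><id>456</id><text>text data 2</text></generalData>
    <specialData><id>123</id><text>special data text</text></specialData>
    <specialData><id>456</id><text>special data text 2</text></specialData>
</xml>';

SELECT c.value('(dataSetData/text/text())[1]', 'VARCHAR(20)') AS DataSetData
    , ids.GeneralDataID
    , g.value('(text/text())[1]', 'VARCHAR(30)') AS GeneralDataText
    , sp.value('(text/text())[1]', 'VARCHAR(30)') AS SpecialDataTest
FROM @xml.nodes('/xml') AS t(c)
    CROSS APPLY c.nodes('generalData') AS general(g)
    -- Expose the current generalData id as a column (alias "ids" is hypothetical)
    CROSS APPLY (SELECT g.value('(id/text())[1]', 'INT') AS GeneralDataID) AS ids
    -- The XPath predicate keeps only specialData nodes with the same id
    CROSS APPLY c.nodes('specialData[id = sql:column("ids.GeneralDataID")]') AS special(sp);
```

With CROSS APPLY, a generalData node that has no matching specialData node produces no row, which mirrors the WHERE version. Switching the last CROSS APPLY to OUTER APPLY would keep such rows with NULL in SpecialDataTest.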