Oracle 的 EXTRACT 函数是否破坏了 XMLELEMENT 中的 NOENTITYESCAPING?

2021-12-30 00:00:00 xml extract escaping oracle11g oracle

甲骨文 11g.我发现如果我将 NOENTITYESCAPING 添加到 XMLELEMENT 函数,它会很好地关闭实体转义.但是,当我将结果传递给 EXTRACT 时,转义似乎又回来了.

Oracle 11g. I figured out that if I add NOENTITYESCAPING to the XMLELEMENT function, it nicely turns off entity escaping. However, when I then pass the result to EXTRACT the escaping seems to come back again.

select xmlelement(NOENTITYESCAPING e,id,'->') 
  from (select level as id 
          from dual 
       connect by level < 6)      

XMLELEMENT(NOENTITYESCAPINGE,ID,'->')
---------------------------------------
<E>1-></E>
<E>2-></E>
<E>3-></E>
<E>4-></E>
<E>5-></E>

现在,添加EXTRACT:

select xmlelement(NOENTITYESCAPING e,id,'->').extract('//text()')
  from (select level as id 
          from dual 
       connect by level < 6)

XMLELEMENT(NOENTITYESCAPINGE,ID,'->').EXTRACT('//TEXT()')
----------------------------------------------------------
1-&gt;
2-&gt;
3-&gt;
4-&gt;
5-&gt;

是否有任何修复/解决方法来保持转义关闭?手册没有任何帮助.

Any fixes/workarounds to keep the escaping switched off? The manual gives no help.

推荐答案

尝试使用 extractvalue() 函数,而不是 extract().下面是一个例子:

Try to use extractvalue() function, which unescapes encoded entities, instead of extract(). Here is an example:

clear screen;
column res format a20;

-- depending on a situation, NOENTITYESCAPING might be dropped

select extractvalue(
                     xmlelement(NOENTITYESCAPING e,id,'->')
                    , '//text()'
                    ) as res
  from (select level as id 
          from dual 
       connect by level < 6)

结果:

RES                
--------------------
1->                  
2->                  
3->                  
4->                  
5->    

但是extractvalue()函数的使用可能受限于它只能返回一个节点的值.在返回多个节点的值的情况下,utl_i18n 包和该包的 unescape_reference() 函数可用于对编码实体进行转义:

But the use of extractvalue() function may be limited by the fact that it can return value of only one node. In a case of returning values of multiple nodes the utl_i18n package, and unescape_reference() function of that package can be used to unescape encoded entities:

clear screen;
column res format a20;

select utl_i18n.unescape_reference(xmlelement(root
                                             , xmlelement(node1, '>')
                                             , xmlelement(node2, '<')
                                             ).extract('//text()').getstringval()
                                   ) as res
 from dual
connect by level <= 3;

结果:

RES                
--------------------
><                   
><                   
>< 

<小时>

是的,因为 utl_i18n.unescape_reference() 函数只接受 varchar2 数据类型和可以隐式转换为 varchar2 的类型的值数据类型,在处理大的字符串"时,您的手是束手无策的.在这种情况下,您可以使用 dbms_xmlgen 包和 转换() 函数尤其如此,它有一个能够接受 CLOB 的重载版本.下面是一个例子:


Yes, as utl_i18n.unescape_reference() function accepts only values of varchar2 data type and types that can be implicitly converted to the varchar2 data type, your hands are tied when it comes to processing large "strings". In this situation you may turn to dbms_xmlgen package and convert() function in particular, which has an overloaded version capable of accepting CLOBs. Here is an example:

select dbms_xmlgen.convert(
                           xmlagg(xmlelement(root
                                             , xmlelement(node1, '>')
                                             , xmlelement(node2, '<')
                                             )
                                  ).extract('//text()').getclobval()
                          , 1) as res
 from dual
connect by level <= 3000;   -- 1 (second parameter of the convert() function)
                            -- instructs function to decode entities  

结果:

RES
------------------------------------------------------
><><><><><><><><><><><><><><><><><><><><><><><><><>
-- ... the rest of the CLOB

相关文章