Oracle中将字符串拆分为多行

2021-12-01 00:00:00 string sql tokenize oracle plsql

我知道 PHP 和 MYSQL 在某种程度上已经解决了这个问题,但我想知道是否有人可以教我在 Oracle 10g(最好)和 11g 中将字符串(逗号分隔)拆分为多行的最简单方法.

I know this has been answered to some degree with PHP and MYSQL, but I was wondering if someone could teach me the simplest approach to splitting a string (comma delimited) into multiple rows in Oracle 10g (preferably) and 11g.

表格如下:

Name | Project | Error 
108    test      Err1, Err2, Err3
109    test2     Err1

我想创建以下内容:

Name | Project | Error
108    Test      Err1
108    Test      Err2 
108    Test      Err3 
109    Test2     Err1

我已经看到了一些关于堆栈的潜在解决方案,但是它们只占一列(以逗号分隔的字符串).任何帮助将不胜感激.

I've seen a few potential solutions around stack, however they only accounted for a single column (being the comma delimited string). Any help would be greatly appreciated.

推荐答案

这可能是一种改进的方式(也可以使用 regexp 和 connect by):

This may be an improved way (also with regexp and connect by):

with temp as
(
    select 108 Name, 'test' Project, 'Err1, Err2, Err3' Error  from dual
    union all
    select 109, 'test2', 'Err1' from dual
)
select distinct
  t.name, t.project,
  trim(regexp_substr(t.error, '[^,]+', 1, levels.column_value))  as error
from 
  temp t,
  table(cast(multiset(select level from dual connect by  level <= length (regexp_replace(t.error, '[^,]+'))  + 1) as sys.OdciNumberList)) levels
order by name

编辑:这是对查询的简单(如不深入")解释.

EDIT: Here is a simple (as in, "not in depth") explanation of the query.

  1. length (regexp_replace(t.error, '[^,]+')) + 1 使用 regexp_replace 删除任何不是分隔符的东西(这里的逗号)case) 和 length +1 来获取有多少元素(错误).
  2. select level from dual connect by level <= (...) 使用分层查询来创建一个匹配数量不断增加的列发现,从 1 到错误总数.

  1. length (regexp_replace(t.error, '[^,]+')) + 1 uses regexp_replace to erase anything that is not the delimiter (comma in this case) and length +1 to get how many elements (errors) are there.
  2. The select level from dual connect by level <= (...) uses a hierarchical query to create a column with an increasing number of matches found, from 1 to the total number of errors.

预览:

select level, length (regexp_replace('Err1, Err2, Err3', '[^,]+'))  + 1 as max 
from dual connect by level <= length (regexp_replace('Err1, Err2, Err3', '[^,]+'))  + 1

  • table(cast(multiset(.....) as sys.OdciNumberList)) 执行一些预言机类型的转换.
    • cast(multiset(.....)) as sys.OdciNumberList 将多个集合(原始数据集中的每一行一个集合)转换为单个数字集合 OdciNumberList.
    • table() 函数将集合转换为结果集.
    • table(cast(multiset(.....) as sys.OdciNumberList)) does some casting of oracle types.
      • The cast(multiset(.....)) as sys.OdciNumberList transforms multiple collections (one collection for each row in the original data set) into a single collection of numbers, OdciNumberList.
      • The table() function transforms a collection into a resultset.
      • FROM 没有连接会在您的数据集和多重集之间创建交叉连接.结果,数据集中有 4 个匹配项的行将重复 4 次(在名为column_value"的列中数字增加).

        FROM without a join creates a cross join between your dataset and the multiset. As a result, a row in the data set with 4 matches will repeat 4 times (with an increasing number in the column named "column_value").

        预览:

        select * from 
        temp t,
        table(cast(multiset(select level from dual connect by  level <= length (regexp_replace(t.error, '[^,]+'))  + 1) as sys.OdciNumberList)) levels
        

      • trim(regexp_substr(t.error, '[^,]+', 1, levels.column_value)) 使用 column_value 作为 nth_appearance/regexp_substr 的 ocurrence 参数.
      • 您可以从数据集中添加一些其他列(例如,t.name, t.project)以方便可视化.
      • trim(regexp_substr(t.error, '[^,]+', 1, levels.column_value)) uses the column_value as the nth_appearance/ocurrence parameter for regexp_substr.
      • You can add some other columns from your data set (t.name, t.project as an example) for easy visualization.
      • 一些对 Oracle 文档的引用:

        Some references to Oracle docs:

        • REGEXP_REPLACE
        • REGEXP_SUBSTR
        • 可扩展性常量、类型和映射 (OdciNumberList)
        • CAST(多组)
        • 分层查询
  • 相关文章