Performance of regexp_replace vs translate in Oracle?

2021-12-24 00:00:00 regex performance sql oracle plsql

For simple things, is it better to use the translate function, on the premise that it is less CPU intensive, or is regexp_replace the way to go?

This question comes from "How to replace brackets with hyphens within the Oracle REGEXP_REPLACE function?"
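For concreteness, the two approaches being compared can be sketched as a single query. This is only an illustration: the sample string 'a(b)[c]' and the column aliases are made up here, and the pattern and character lists are the ones used in the benchmark in the answer.

```sql
-- Replace every bracket or parenthesis with a hyphen, two ways.
-- The input literal 'a(b)[c]' is a hypothetical sample value.
SELECT translate('a(b)[c]', '()[]', '----')            AS via_translate,
       regexp_replace('a(b)[c]', '[]()[]', '-', 1, 0)  AS via_regexp
  FROM dual;
-- Both columns return: a-b--c-
```

In the bracket expression '[]()[]' the leading ] is taken literally, so the class matches any of ], (, ) and [. Both calls therefore perform the same substitution, which is what makes the timing comparison meaningful.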

Recommended answer

I think you're running into a simple optimization. The regular expression is expensive enough to compute that its result is cached in the hope that it will be used again. If you convert distinct strings instead of the same one repeatedly, you will see that the modest translate is naturally faster, because this is exactly what it is specialized for.

Here's my example, running on 11.1.0.7.0:

SET SERVEROUTPUT ON
DECLARE
   TYPE t IS TABLE OF VARCHAR2(4000);
   l       t;
   l_level NUMBER := 1000;
   l_time  TIMESTAMP;
   l_char  VARCHAR2(4000);
BEGIN
   -- init: generate l_level distinct random 2000-character strings
   EXECUTE IMMEDIATE 'ALTER SESSION SET PLSQL_OPTIMIZE_LEVEL=2';
   SELECT dbms_random.STRING('p', 2000)
     BULK COLLECT
     INTO l FROM dual
   CONNECT BY LEVEL <= l_level;
   -- regex: replace every ], (, ) or [ with a hyphen
   l_time := systimestamp;
   FOR i IN 1 .. l.count LOOP
      l_char := regexp_replace(l(i), '[]()[]', '-', 1, 0);
   END LOOP;
   dbms_output.put_line('regex     :' || (systimestamp - l_time));
   -- translate: same replacement, one character mapped at a time
   l_time := systimestamp;
   FOR i IN 1 .. l.count LOOP
      l_char := translate(l(i), '()[]', '----');
   END LOOP;
   dbms_output.put_line('translate :' || (systimestamp - l_time));
END;
/

regex     :+000000000 00:00:00.979305000
translate :+000000000 00:00:00.238773000

PL/SQL procedure successfully completed

On 11.2.0.3.0:

regex     :+000000000 00:00:00.617290000
translate :+000000000 00:00:00.138205000

Conclusion: In general I suspect translate will win.
