CHR(0) in REGEXP_LIKE
I am using the following queries to check how CHR(0) behaves in REGEXP_LIKE.
CREATE TABLE t1(a char(10));
INSERT INTO t1 VALUES('0123456789');
SELECT CASE WHEN REGEXP_LIKE(a,CHR(0)) THEN 1 ELSE 0 END col, DUMP(a)
FROM t1;
The output I get:
col dump(a)
----------- -----------------------------------
1 Typ=96 Len=10: 48,49,50,51,52,53,54,55,56,57
I am totally confused: if there is no CHR(0) in the column, as DUMP(a) shows, how is REGEXP_LIKE finding CHR(0) in the column and returning 1? Shouldn't it return 0 here?
Recommended answer
CHR(0) is the character used to terminate a string in the C programming language (among others).
When you pass CHR(0) to the function, it in turn passes it to a lower-level function that parses the strings you have passed in and builds a regular expression pattern from them. That lower-level code sees CHR(0), treats it as the string terminator, and ignores the rest of the pattern.
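The truncation described above can be reproduced outside Oracle. What follows is a minimal Python sketch, assuming only that the lower-level routine reads the pattern as a NUL-terminated C string. It is not Oracle's actual implementation, and as_c_string is a hypothetical helper name introduced for illustration.

```python
import ctypes


def as_c_string(data: bytes) -> bytes:
    """Return what C code sees when it reads `data` as a NUL-terminated string."""
    # create_string_buffer copies the bytes and appends a terminating NUL.
    buf = ctypes.create_string_buffer(data)
    # .value stops at the first NUL byte, the same way strlen() would.
    return buf.value


# A pattern consisting only of CHR(0) is seen as an empty string.
print(as_c_string(b"\x00"))      # b''
# Anything after an embedded CHR(0) is ignored.
print(as_c_string(b"abc\x00e"))  # b'abc'
```

Under that assumption, a pattern of CHR(0) is indistinguishable from an empty pattern by the time the regular expression is built.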
The behaviour is easier to see with REGEXP_REPLACE:
SELECT REGEXP_REPLACE( 'abc' || CHR(0) || 'e', CHR(0), 'd' )
FROM DUAL;
What happens when this is run:
- CHR(0) is compiled into the regular expression and becomes a string terminator.
- The pattern is now just the string terminator, so the pattern is a zero-length string.
- The regular expression is then matched against the input string. It reads the first character, a, finds that a zero-length string can be matched before the a, and replaces the nothing it matched before the a with a d, giving the output da.
- It then repeats for the next character, transforming b to db.
- And so on until it reaches the end of the string, where it matches the zero-length pattern once more and appends a final d.
You get the output:
dadbdcd_ded
(where _ is the CHR(0) character.)
Note: the CHR(0) in the input is not replaced.
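The same pattern of replacements can be reproduced with any regular expression engine by using an empty pattern. The Python sketch below is only an analogy: it assumes the CHR(0) pattern has already been reduced to a zero-length pattern, and it uses Python's re module (Python 3.7 or later), not Oracle's engine.

```python
import re

# Same input as the REGEXP_REPLACE example: 'abc' || CHR(0) || 'e'
source = "abc" + chr(0) + "e"

# The empty pattern stands in for what CHR(0) becomes after truncation (assumption).
result = re.sub("", "d", source)

print(repr(result))                 # 'dadbdcd\x00ded'
print([ord(ch) for ch in result])   # code points, including the untouched 0
```

A d is inserted at every position, and the NUL character in the input is left in place because the pattern never matches an actual character.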
If the client program you are using also truncates the string at CHR(0), you may not see the entire output (this is an issue with how your client represents the string, not with Oracle's output), but the full result can also be shown using DUMP():
SELECT DUMP( REGEXP_REPLACE( 'abc' || CHR(0) || 'e', CHR(0), 'd' ) )
FROM DUAL;
Output:
Typ=1 Len=11: 100,97,100,98,100,99,100,0,100,101,100
[TL;DR] So with
REGEXP_LIKE( '1234567890', CHR(0) )
it makes a zero-length regular expression pattern and looks for a zero-length match before the 1 character, which it finds, and so it reports that it has found a match.
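As a rough illustration of why the match succeeds even though the string contains no CHR(0), here is a short Python sketch. It again assumes the pattern has been reduced to an empty string and uses Python's re module as a stand-in for Oracle's engine; empty_pattern_matches is a hypothetical helper name.

```python
import re


def empty_pattern_matches(text: str) -> bool:
    """Return True if a zero-length pattern matches anywhere in `text`."""
    return re.search("", text) is not None


text = "1234567890"
print(empty_pattern_matches(text))  # True: the match is found before the first character
print(chr(0) in text)               # False: the string contains no NUL character
```

A zero-length pattern matches every string, so the result says nothing about whether a NUL character is actually present.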