How to extract two consecutive digits from a text field in MySQL?
I have a MySQL database and the following query:
SELECT `id`, `originaltext` FROM `source` WHERE `originaltext` regexp '[0-9][0-9]'
This matches every originaltext that contains a two-digit number.
I need MySQL to return those numbers as a field, so I can manipulate them further.
Ideally, I could also add a condition that the number should be > 20, but I can do that separately as well.
Recommended answer
If you want more regular expression power in your database, consider LIB_MYSQLUDF_PREG, an open source library of MySQL user-defined functions that wraps the PCRE library. It is delivered in source code form only, so to use it you need to be able to compile it and install it into your MySQL server. Installing it does not change MySQL's built-in regex support in any way; it merely makes the following additional functions available:
PREG_CAPTURE extracts a regex match from a string. PREG_POSITION returns the position at which a regular expression matches a string. PREG_REPLACE performs a search-and-replace on a string. PREG_RLIKE tests whether a regex matches a string.
All these functions take a regular expression as their first parameter. The regular expression must be formatted like a Perl regular expression operator, delimiters included. For example, to test whether a regex matches the subject case-insensitively, you'd use the MySQL code PREG_RLIKE('/regex/i', subject). This is similar to PHP's preg functions, which also require the extra // delimiters around the regular expression inside the PHP string.
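Applied to the query from the question, a minimal sketch could look like the following. It assumes LIB_MYSQLUDF_PREG has already been compiled and installed on the server, and that PREG_CAPTURE is called with the delimited pattern first and the subject second, as described above. The alias `num` is made up for illustration, and the handling of rows without a match should be checked against the library's documentation.

```sql
-- Sketch only: requires the LIB_MYSQLUDF_PREG functions to be installed.
-- `num` is an illustrative alias, not something defined by the library.
SELECT `id`,
       `originaltext`,
       PREG_CAPTURE('/[0-9][0-9]/', `originaltext`) AS num
FROM `source`
WHERE `originaltext` REGEXP '[0-9][0-9]'          -- built-in filter, as in the question
HAVING CAST(num AS UNSIGNED) > 20;                -- the optional "> 20" condition
```

MySQL allows a select alias in HAVING, which is why the extracted value is filtered there instead of repeating the PREG_CAPTURE call in the WHERE clause.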
If you want something simpler, you could adapt this function to better suit your needs.
-- In the mysql command-line client, switch the statement delimiter so the
-- semicolons inside the function body do not end the CREATE statement early.
DELIMITER $$

CREATE FUNCTION REGEXP_EXTRACT(string TEXT, exp TEXT)
-- Extract the first longest string that matches the regular expression
-- If the string is 'ABCD', check all strings and see what matches: 'ABCD', 'ABC', 'AB', 'A', 'BCD', 'BC', 'B', 'CD', 'C', 'D'
-- It's not smart enough to handle things like (A)|(BCD) correctly in that it will return the whole string, not just the matching token.
RETURNS TEXT
DETERMINISTIC
BEGIN
    DECLARE s INT DEFAULT 1;  -- start position of the candidate substring (1-based)
    DECLARE e INT;            -- end position of the candidate substring (inclusive)
    DECLARE adjustStart TINYINT DEFAULT 1;
    DECLARE adjustEnd TINYINT DEFAULT 1;

    -- Because REGEXP matches anywhere in the string, and we only want the part that matches, adjust the expression to add '^' and '$'
    -- Of course, if those are already there, don't add them, but change the method of extraction accordingly.
    IF LEFT(exp, 1) = '^' THEN
        SET adjustStart = 0;
    ELSE
        SET exp = CONCAT('^', exp);
    END IF;
    IF RIGHT(exp, 1) = '$' THEN
        SET adjustEnd = 0;
    ELSE
        SET exp = CONCAT(exp, '$');
    END IF;

    -- Loop through the string, moving the end pointer back towards the start pointer, then advance the start pointer and repeat
    -- Bail out of the loops early if the original expression started with '^' or ended with '$', since that means the pointers can't move
    -- CHAR_LENGTH counts characters (LENGTH counts bytes), which matches how SUBSTRING indexes the string.
    WHILE (s <= CHAR_LENGTH(string)) DO
        SET e = CHAR_LENGTH(string);
        WHILE (e >= s) DO
            -- SUBSTRING takes a length, not an end position, so convert s..e to a length
            IF SUBSTRING(string, s, e - s + 1) REGEXP exp THEN
                RETURN SUBSTRING(string, s, e - s + 1);
            END IF;
            IF adjustEnd THEN
                SET e = e - 1;
            ELSE
                SET e = s - 1; -- ugh, such a hack to end it early
            END IF;
        END WHILE;
        IF adjustStart THEN
            SET s = s + 1;
        ELSE
            SET s = CHAR_LENGTH(string) + 1; -- ugh, such a hack to end it early
        END IF;
    END WHILE;

    RETURN NULL;
END$$

DELIMITER ;
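As a rough usage sketch for the original question, assuming the REGEXP_EXTRACT function above has been created in the same schema as the `source` table, the two-digit number could be returned as its own column and then filtered. The alias `num` is illustrative only.

```sql
-- Sketch only: relies on the user-defined REGEXP_EXTRACT function shown above.
SELECT `id`,
       `originaltext`,
       REGEXP_EXTRACT(`originaltext`, '[0-9][0-9]') AS num
FROM `source`
WHERE `originaltext` REGEXP '[0-9][0-9]'          -- skip rows with no two-digit run
HAVING CAST(num AS UNSIGNED) > 20;                -- the optional "> 20" condition
```

The function returns only the first match per row and runs a nested loop over every row it is called on, so keeping the REGEXP filter in the WHERE clause avoids calling it on rows that cannot match.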