How can I use a regular expression on the result of a subquery?
I have two tables.
The users table has id and phone_no:
id   phone_no
1    9912678
2    9912323
3    9912366
The admission table also has id and phone_no:
id   phone_no
6    991267823
7    991236621
8    435443455
9    243344333
I want to find all phone numbers in the admission table that follow the same pattern as the ones in the users table, and update them in the users table.
So I am trying this:
select phone_no from admission where phone_no REGEXP (SELECT phone_no
FROM `users` AS user
WHERE user.phone_no REGEXP '^(99)+[0-9]{8}')
But I am getting this error: Subquery returns more than 1 row.
Looking for help.
Recommended answer
Try one of the following queries:
SELECT a.phone_no
FROM admission a
JOIN users u on a.phone_no LIKE concat(u.phone_no, '__')
WHERE u.phone_no REGEXP '^(99)+[0-9]+$'
or
SELECT a.phone_no
FROM admission a
JOIN users u on a.phone_no REGEXP concat('^', u.phone_no, '[0-9]{2}$')
WHERE u.phone_no REGEXP '^(99)+[0-9]+$'
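To try the second query against the question's sample data, a minimal setup could look like the sketch below. The column types are an assumption (the question only lists column names); VARCHAR is used so that LIKE, REGEXP and CONCAT operate on strings.

```sql
-- Assumed schema: the question does not give column types.
CREATE TABLE users     (id INT PRIMARY KEY, phone_no VARCHAR(20));
CREATE TABLE admission (id INT PRIMARY KEY, phone_no VARCHAR(20));

-- Sample rows copied from the question.
INSERT INTO users VALUES
  (1, '9912678'), (2, '9912323'), (3, '9912366');
INSERT INTO admission VALUES
  (6, '991267823'), (7, '991236621'), (8, '435443455'), (9, '243344333');

-- Each admission number must be a users number plus exactly two more digits.
SELECT a.phone_no
FROM admission a
JOIN users u ON a.phone_no REGEXP CONCAT('^', u.phone_no, '[0-9]{2}$')
WHERE u.phone_no REGEXP '^(99)+[0-9]+$';
```

With these rows the query should return 991267823 (from 9912678) and 991236621 (from 9912366). The join compares row by row, so it avoids the "Subquery returns more than 1 row" error that a scalar subquery on the right-hand side of REGEXP produces.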
If the number of "trailing digits" is not fixed, you can also use:
LIKE concat(u.phone_no, '%')
or
REGEXP concat('^', u.phone_no, '[0-9]*$')
But in this case you might need to use SELECT DISTINCT a.phone_no if one users.phone_no can be a prefix of another users.phone_no (e.g. 99123 and 991234).
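The duplicate case can be reproduced with a small sketch. The tables users_demo and admission_demo and their rows are hypothetical, chosen only so that two users numbers are prefixes of the same admission number.

```sql
-- Hypothetical demo tables, not part of the question's schema.
CREATE TABLE users_demo     (id INT PRIMARY KEY, phone_no VARCHAR(20));
CREATE TABLE admission_demo (id INT PRIMARY KEY, phone_no VARCHAR(20));

INSERT INTO users_demo     VALUES (1, '99123'), (2, '991234');
INSERT INTO admission_demo VALUES (1, '99123456');

-- '99123456' joins to both users rows; DISTINCT collapses the two matches into one.
SELECT DISTINCT a.phone_no
FROM admission_demo a
JOIN users_demo u ON a.phone_no REGEXP CONCAT('^', u.phone_no, '[0-9]*$')
WHERE u.phone_no REGEXP '^(99)+[0-9]+$';
```

Without DISTINCT the same admission number would be listed once per matching users row.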
Update
After running some tests with 10K rows in the users table and 100K rows in the admission table, I came to the following query:
SELECT a.phone_no
FROM admission a
JOIN users u
ON a.phone_no >= u.phone_no
AND a.phone_no < CONCAT(u.phone_no, 'z')
AND a.phone_no LIKE CONCAT(u.phone_no, '%')
AND a.phone_no REGEXP CONCAT('^', u.phone_no, '[0-9]*$')
WHERE u.phone_no LIKE '99%'
AND u.phone_no REGEXP '^(99)+[0-9]*$'
UNION SELECT 0 FROM (SELECT 0) dummy WHERE 0
This way you can use REGEXP and still have great performance. This query executes almost instantly in my test case.
Logically, you only need the REGEXP conditions, but on bigger tables the query might time out. A LIKE condition filters the result set before the REGEXP check. Even with LIKE, though, the query doesn't perform very well: for some reason MySQL doesn't use a range check for the join. So I added an explicit range check:
ON a.phone_no >= u.phone_no
AND a.phone_no < CONCAT(u.phone_no, 'z')
With this check you can remove the LIKE condition from the JOIN part.
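With the LIKE condition dropped from the join, the query might reduce to the sketch below. It assumes phone_no is a string column in both tables and that admission.phone_no is indexed; the answer does not state its indexes, so that part is a guess about the test setup.

```sql
SELECT a.phone_no
FROM admission a
JOIN users u
  -- Range check: every string that starts with u.phone_no followed by digits
  -- sorts at or after u.phone_no and before u.phone_no + 'z'.
  ON a.phone_no >= u.phone_no
 AND a.phone_no <  CONCAT(u.phone_no, 'z')
  -- REGEXP then confirms that only digits follow the prefix.
 AND a.phone_no REGEXP CONCAT('^', u.phone_no, '[0-9]*$')
WHERE u.phone_no LIKE '99%'
  AND u.phone_no REGEXP '^(99)+[0-9]*$'
-- Unchanged from the answer's query: empty UNION used to drop duplicates.
UNION SELECT 0 FROM (SELECT 0) dummy WHERE 0;
```

Comparing the output of EXPLAIN for this version and for the LIKE-only version on your own data is a reasonable way to check whether the range condition is actually used.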
The UNION part is a replacement for DISTINCT. MySQL seems to translate DISTINCT into a GROUP BY, which doesn't perform well. By using UNION with an empty result set, I force MySQL to remove duplicates after the SELECT. You can remove that line if you use a fixed number of trailing digits.
You can adjust the REGEXP patterns to your needs:
...
AND a.phone_no REGEXP CONCAT('^', u.phone_no, '[0-9]{2}$')
...
AND u.phone_no REGEXP '^(99)+[0-9]{8}$'
...
If you only need REGEXP to check the length of the phone_no, you can also use a LIKE condition with the '_' placeholder.
AND a.phone_no LIKE CONCAT(u.phone_no, '__')
...
AND u.phone_no LIKE '99________'
or combine a LIKE condition with a CHAR_LENGTH check.
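A length-based variant could be sketched as follows. It uses MySQL's CHAR_LENGTH function; the value 10 mirrors the ten-character '99' pattern in the snippet above and is an assumption about the real data (the question's sample numbers have 7 digits, so 7 would be needed there).

```sql
SELECT a.phone_no
FROM admission a
JOIN users u
  -- The admission number starts with the users number...
  ON a.phone_no LIKE CONCAT(u.phone_no, '%')
  -- ...and is exactly two characters longer.
 AND CHAR_LENGTH(a.phone_no) = CHAR_LENGTH(u.phone_no) + 2
WHERE u.phone_no LIKE '99%'
  -- Assumed length; adjust to the real length of users.phone_no.
  AND CHAR_LENGTH(u.phone_no) = 10;
```

Unlike the REGEXP version, this only checks prefix and length, so it does not verify that the extra characters are digits.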