SQL query - how to apply a limit within a group
I have a table named t1 with the following fields: ROWID, CID, PID, Score, SortKey
It has the following data:
1, C1, P1, 10, 1
2, C1, P2, 20, 2
3, C1, P3, 30, 3
4, C2, P4, 20, 3
5, C2, P5, 30, 2
6, C3, P6, 10, 1
7, C3, P7, 20, 2
What query do I write so that it groups by CID but, instead of returning a single result per group, returns a maximum of 2 results per group? The WHERE condition is Score >= 20, and I want the results ordered by CID and SortKey.
If I ran the query on the data above, I would expect the following result:
Results for C1 - note: ROWID 1 is not considered because its score is < 20
C1, P2, 20, 2
C1, P3, 30, 3
Results for C2 - note: ROWID 5 appears before ROWID 4 because ROWID 5 has the smaller SortKey
C2, P5, 30, 2
C2, P4, 20, 3
Results for C3 - note: ROWID 6 does not appear because its score is less than 20, so only 1 record is returned here
C3, P7, 20, 2
In short, I want a LIMIT within a GROUP BY. I want the simplest solution and want to avoid temp tables. Subqueries are fine. Also note that I am using SQLite for this.
Recommended answer
Here's a fairly portable query to do what you want:
    SELECT *
    FROM table1 a
    WHERE a."ROWID" IN (
        SELECT b."ROWID"
        FROM table1 b
        WHERE b."Score" >= 20
          AND b."ROWID" IS NOT NULL
          AND a."CID" = b."CID"
        ORDER BY b."CID", b."SortKey"
        LIMIT 2
    )
    ORDER BY a."CID", a."SortKey";
The query uses a correlated subquery with a sort and a limit to produce the list of ROWIDs that should appear in the final result. Because the correlated subquery is executed for every row, whether or not that row ends up in the result, it may not be as efficient as the window function version given below - but unlike that version it'll work on SQLite3 builds that lack window functions (SQLite gained them only in version 3.25.0).
This query requires that ROWID is unique (can be used as a primary key).
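To try the correlated-subquery version without setting up a database, here is a minimal sketch using Python's standard sqlite3 module and an in-memory database. The table definition (column types, "ROWID" declared as INTEGER PRIMARY KEY) and the helper name run_query are assumptions for illustration; the question doesn't show the real schema. The query text is the one above, minus the trailing semicolon.

```python
import sqlite3

# Sample rows from the question: (ROWID, CID, PID, Score, SortKey).
ROWS = [
    (1, "C1", "P1", 10, 1),
    (2, "C1", "P2", 20, 2),
    (3, "C1", "P3", 30, 3),
    (4, "C2", "P4", 20, 3),
    (5, "C2", "P5", 30, 2),
    (6, "C3", "P6", 10, 1),
    (7, "C3", "P7", 20, 2),
]

# The correlated-subquery query from the answer.
QUERY = """
SELECT *
FROM table1 a
WHERE a."ROWID" IN (
    SELECT b."ROWID"
    FROM table1 b
    WHERE b."Score" >= 20
      AND b."ROWID" IS NOT NULL
      AND a."CID" = b."CID"
    ORDER BY b."CID", b."SortKey"
    LIMIT 2
)
ORDER BY a."CID", a."SortKey"
"""


def run_query(rows):
    """Load rows into an in-memory table1 (assumed schema) and run QUERY."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            'CREATE TABLE table1 ("ROWID" INTEGER PRIMARY KEY, "CID" TEXT, '
            '"PID" TEXT, "Score" INTEGER, "SortKey" INTEGER)'
        )
        conn.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


result = run_query(ROWS)
for row in result:
    print(row)
```

On the question's data this should return ROWIDs 2, 3, 5, 4, 7 in that order, matching the expected result in the question.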
I tested the above in PostgreSQL 9.2 and in SQLite3 3.7.11; it works fine in both. It won't work on MySQL 5.5 or the latest 5.6 milestone because MySQL doesn't support LIMIT in a subquery used with IN.
SQLFiddle demos:
PostgreSQL (works fine): http://sqlfiddle.com/#!12/22829/3
SQLite3 (works fine, same query text, but needed single-valued inserts due to an apparent JDBC driver limitation): http://sqlfiddle.com/#!7/9ecd8/1
MySQL 5.5 (fails in two ways: MySQL doesn't accept the a."ROWID" quoting even in ANSI mode, so I had to un-quote; it then fails with "This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'"): http://sqlfiddle.com/#!2/e1f31/2
SQLite demo showing it works just fine on the SQLite3 command line: http://pastebin.com/26n4NiUC
Output (PostgreSQL):
ROWID | CID | PID | Score | SortKey
-------+-----+-----+-------+---------
2 | C1 | P2 | 20 | 2
3 | C1 | P3 | 30 | 3
5 | C2 | P5 | 30 | 2
4 | C2 | P4 | 20 | 3
7 | C3 | P7 | 20 | 2
(5 rows)
If you want to filter for a particular CID, just add AND "CID" = 'C1' or whatever to the outer WHERE clause.
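As a sketch of that filter applied to the correlated-subquery version: the extra condition goes in the outer WHERE, here as a bound parameter rather than a literal. The schema and the helper name top_two_for_cid are assumptions for illustration.

```python
import sqlite3

# Sample rows from the question: (ROWID, CID, PID, Score, SortKey).
ROWS = [
    (1, "C1", "P1", 10, 1),
    (2, "C1", "P2", 20, 2),
    (3, "C1", "P3", 30, 3),
    (4, "C2", "P4", 20, 3),
    (5, "C2", "P5", 30, 2),
    (6, "C3", "P6", 10, 1),
    (7, "C3", "P7", 20, 2),
]

# Same query as before, with one extra condition in the OUTER WHERE clause.
FILTERED_QUERY = """
SELECT *
FROM table1 a
WHERE a."ROWID" IN (
    SELECT b."ROWID"
    FROM table1 b
    WHERE b."Score" >= 20
      AND a."CID" = b."CID"
    ORDER BY b."SortKey"
    LIMIT 2
)
  AND a."CID" = ?
ORDER BY a."CID", a."SortKey"
"""


def top_two_for_cid(conn, cid):
    """Return at most two qualifying rows for a single CID."""
    return conn.execute(FILTERED_QUERY, (cid,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE table1 ("ROWID" INTEGER PRIMARY KEY, "CID" TEXT, '
    '"PID" TEXT, "Score" INTEGER, "SortKey" INTEGER)'
)
conn.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?, ?)", ROWS)

result_c1 = top_two_for_cid(conn, "C1")
result_c3 = top_two_for_cid(conn, "C3")
conn.close()

print(result_c1)
print(result_c3)
```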
Here's a closely related answer with more detailed examples: https://stackoverflow.com/a/13411138/398670
Since this was originally tagged just SQL (no SQLite)... just for completeness, in PostgreSQL or other DBs with SQL-standard window function support I'd probably do this:
    SELECT "ROWID", "CID", "PID", "Score", "SortKey"
    FROM (
        SELECT *, row_number() OVER (PARTITION BY "CID" ORDER BY "SortKey") AS n
        FROM table1
        WHERE "Score" >= 20
    ) x
    WHERE n < 3
    ORDER BY "CID", "SortKey";
This produces the same result. SQLFiddle, including an extra C1 row to demonstrate that the limiting filter actually works: http://sqlfiddle.com/#!12/22829/1
If you want to filter for a particular CID, just add AND "CID" = 'C1' or whatever to the inner WHERE clause.
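The window function version can also be tried from Python's sqlite3 module, provided the SQLite library it is linked against is 3.25.0 or newer (the release that added window functions). The sketch below checks the version first and skips the query otherwise. The schema is assumed, and the extra C1 row (ROWID 8) is made-up data added only so that one group has three qualifying rows.

```python
import sqlite3

# Question's sample rows plus one hypothetical extra C1 row (ROWID 8).
ROWS = [
    (1, "C1", "P1", 10, 1),
    (2, "C1", "P2", 20, 2),
    (3, "C1", "P3", 30, 3),
    (4, "C2", "P4", 20, 3),
    (5, "C2", "P5", 30, 2),
    (6, "C3", "P6", 10, 1),
    (7, "C3", "P7", 20, 2),
    (8, "C1", "P8", 40, 4),
]

# Number the qualifying rows within each CID by SortKey, keep the first two.
WINDOW_QUERY = """
SELECT "ROWID", "CID", "PID", "Score", "SortKey"
FROM (
    SELECT *, row_number() OVER (PARTITION BY "CID" ORDER BY "SortKey") AS n
    FROM table1
    WHERE "Score" >= 20
) x
WHERE n < 3
ORDER BY "CID", "SortKey"
"""

result = None  # stays None when the linked SQLite has no window functions
if sqlite3.sqlite_version_info >= (3, 25, 0):
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE table1 ("ROWID" INTEGER PRIMARY KEY, "CID" TEXT, '
        '"PID" TEXT, "Score" INTEGER, "SortKey" INTEGER)'
    )
    conn.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?, ?)", ROWS)
    result = conn.execute(WINDOW_QUERY).fetchall()
    conn.close()

print(result)
```

C1 has three rows with Score >= 20 here, and only the two with the smallest SortKey (P2 and P3) should survive the n < 3 filter.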
BTW, your test data is insufficient: no CID has more than two rows with score >= 20, so the per-group limit is never actually exercised.
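One way to exercise the limit is to add rows so that some groups have three qualifying rows, then compare the number of qualifying rows per CID with the number the limited query returns. The two extra rows (ROWIDs 8 and 9) and the schema are made up for this sketch; they are not from the question.

```python
import sqlite3

# Question's sample rows plus two hypothetical rows (ROWIDs 8 and 9)
# so that C1 and C2 each have three rows with Score >= 20.
ROWS = [
    (1, "C1", "P1", 10, 1),
    (2, "C1", "P2", 20, 2),
    (3, "C1", "P3", 30, 3),
    (4, "C2", "P4", 20, 3),
    (5, "C2", "P5", 30, 2),
    (6, "C3", "P6", 10, 1),
    (7, "C3", "P7", 20, 2),
    (8, "C1", "P8", 40, 4),
    (9, "C2", "P9", 50, 1),
]

# Rows per CID that pass the score filter, with no per-group limit.
ELIGIBLE = """
SELECT "CID", COUNT(*)
FROM table1
WHERE "Score" >= 20
GROUP BY "CID"
ORDER BY "CID"
"""

# Rows per CID kept by the correlated subquery with LIMIT 2.
LIMITED = """
SELECT a."CID", COUNT(*)
FROM table1 a
WHERE a."ROWID" IN (
    SELECT b."ROWID"
    FROM table1 b
    WHERE b."Score" >= 20
      AND a."CID" = b."CID"
    ORDER BY b."SortKey"
    LIMIT 2
)
GROUP BY a."CID"
ORDER BY a."CID"
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE table1 ("ROWID" INTEGER PRIMARY KEY, "CID" TEXT, '
    '"PID" TEXT, "Score" INTEGER, "SortKey" INTEGER)'
)
conn.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?, ?)", ROWS)
eligible = conn.execute(ELIGIBLE).fetchall()
limited = conn.execute(LIMITED).fetchall()
conn.close()

print("eligible:", eligible)
print("limited: ", limited)
```

With this data C1 and C2 each have three qualifying rows but contribute only two to the limited result, while C3 still contributes its single qualifying row.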