Best way to get result count before LIMIT was applied

When paging through data that comes from a DB, you need to know how many pages there will be to render the page jump controls.

Currently I do that by running the query twice, once wrapped in a count() to determine the total results, and a second time with a limit applied to get back just the results I need for the current page.

This seems inefficient. Is there a better way to determine how many results would have been returned before LIMIT was applied?

I am using PHP and Postgres.

Recommended answer

Pure SQL

Things have changed since 2008. You can use a window function to get the full count and the limited result in one query. Window functions were introduced with PostgreSQL 8.4 in 2009.

SELECT foo
     , count(*) OVER() AS full_count
FROM   bar
WHERE  <some condition>
ORDER  BY <some col>
LIMIT  <pagesize>
OFFSET <offset>;

Note that this can be considerably more expensive than the same query without the total count. All qualifying rows have to be counted, and a possible shortcut that takes just the top rows from a matching index may no longer help.
This doesn't matter much with small tables or when full_count <= OFFSET + LIMIT. It matters when full_count is substantially bigger.
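
As a rough illustration of how a client might turn full_count into page controls, here is a minimal Python sketch. It uses the standard-library sqlite3 module as a stand-in for Postgres, purely so the example is self-contained (SQLite 3.25 or later also understands count(*) OVER ()). The table bar and column foo come from the query above; fetch_page, the filter foo > ? and the sample data are hypothetical.

```python
import math
import sqlite3


def fetch_page(conn, min_foo, page_size, page):
    """Return (rows, full_count, total_pages) for a 1-based page number."""
    offset = (page - 1) * page_size
    result = conn.execute(
        """
        SELECT foo, count(*) OVER () AS full_count
        FROM   bar
        WHERE  foo > ?
        ORDER  BY foo
        LIMIT  ? OFFSET ?
        """,
        (min_foo, page_size, offset),
    ).fetchall()
    if not result:
        # No row came back (OFFSET past the end or nothing matched),
        # so there is no full_count to read either.
        return [], None, None
    full_count = result[0][1]  # identical on every returned row
    rows = [r[0] for r in result]
    return rows, full_count, math.ceil(full_count / page_size)


# Hypothetical sample data: foo = 1..10
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bar (foo INTEGER)")
conn.executemany("INSERT INTO bar (foo) VALUES (?)", [(i,) for i in range(1, 11)])

rows, full_count, total_pages = fetch_page(conn, min_foo=3, page_size=3, page=2)
print(rows, full_count, total_pages)  # [7, 8, 9] 7 3
```

The count is read from the first returned row, which is why an empty page yields no count at all.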

Corner case: when OFFSET is at least as great as the number of rows from the base query, no row is returned. So you also get no full_count. Possible alternative:

  • Run a query with a LIMIT/OFFSET and also get the total number of rows
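
One way to keep the total even when the requested page is empty is to compute the count in its own subquery and outer-join the page to it, so that at least one row always comes back. The following is a minimal sketch of that idea under stated assumptions, not necessarily what the linked answer does. It uses Python's sqlite3 module as a stand-in for Postgres so it can run on its own; fetch_page_with_total, the filter foo > ? and the sample data are hypothetical.

```python
import sqlite3


def fetch_page_with_total(conn, min_foo, page_size, offset):
    """Return (rows, full_count); full_count is available even for an empty page."""
    result = conn.execute(
        """
        WITH cte AS (
            SELECT foo FROM bar WHERE foo > ?
        )
        SELECT c.full_count, p.foo
        FROM  (SELECT count(*) AS full_count FROM cte) AS c
        LEFT  JOIN (
            SELECT foo FROM cte ORDER BY foo LIMIT ? OFFSET ?
        ) AS p ON 1 = 1
        ORDER  BY p.foo
        """,
        (min_foo, page_size, offset),
    ).fetchall()
    # The count subquery always yields exactly one row, so result is never empty.
    full_count = result[0][0]
    # An empty page shows up as a single row with foo = NULL.
    # (foo > ? already excludes real NULLs, so None can only mean "no page row".)
    rows = [foo for _, foo in result if foo is not None]
    return rows, full_count


# Hypothetical sample data: foo = 1..10
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bar (foo INTEGER)")
conn.executemany("INSERT INTO bar (foo) VALUES (?)", [(i,) for i in range(1, 11)])

print(fetch_page_with_total(conn, 3, 3, 3))    # ([7, 8, 9], 7)
print(fetch_page_with_total(conn, 3, 3, 100))  # ([], 7)
```

The price is that the base query is referenced twice, so this is not automatically cheaper than running two separate queries.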

Sequence of events in a SELECT query:

( 0. CTEs are evaluated and materialized separately. In Postgres 12 or later the planner may inline those like subqueries before going to work.) Not here.

  1. The WHERE clause (and JOIN conditions, though there are none in your example) filters qualifying rows from the base table(s). The rest is based on the filtered subset.

( 2. GROUP BY and aggregate functions would go here.) Not here.

( 3. Other SELECT list expressions are evaluated, based on grouped / aggregated columns.) Not here.

  4. Window functions are applied depending on the OVER clause and the frame specification of the function. The simple count(*) OVER() is based on all qualifying rows.

  5. ORDER BY

( 6. DISTINCT or DISTINCT ON would go here.) Not here.

  7. LIMIT / OFFSET are applied based on the established order to select rows to return.
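
The ordering above can be observed directly. In this small Python sketch (sqlite3 stands in for Postgres so the example is runnable on its own, and the sample data is hypothetical), the window count reflects the WHERE filter but not the LIMIT, unless the LIMIT is pushed into a subquery so that it runs before the outer window function.

```python
import sqlite3

# Hypothetical sample data: foo = 1..10
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bar (foo INTEGER)")
conn.executemany("INSERT INTO bar (foo) VALUES (?)", [(i,) for i in range(1, 11)])

# Window function at the same query level as LIMIT:
# WHERE runs first, count(*) OVER () sees all 6 qualifying rows,
# and LIMIT trims the output afterwards.
same_level = conn.execute(
    """
    SELECT foo, count(*) OVER () AS full_count
    FROM   bar
    WHERE  foo > 4
    ORDER  BY foo
    LIMIT  2
    """
).fetchall()

# LIMIT moved into a subquery: the outer window function only sees 2 rows.
after_limit = conn.execute(
    """
    SELECT foo, count(*) OVER () AS full_count
    FROM  (SELECT foo FROM bar WHERE foo > 4 ORDER BY foo LIMIT 2) AS sub
    ORDER  BY foo
    """
).fetchall()

print(same_level)   # [(5, 6), (6, 6)]
print(after_limit)  # [(5, 2), (6, 2)]
```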

LIMIT / OFFSET becomes increasingly inefficient with a growing number of rows in the table. Consider alternative approaches if you need better performance:

  • Optimize query with OFFSET on large table
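
One commonly used alternative to large OFFSET values is keyset (or "seek") pagination: remember the last sort key of the previous page and filter on it instead of skipping rows. The sketch below is an assumption about the kind of approach meant here, not a summary of the linked post. It uses Python's sqlite3 module as a stand-in for Postgres; fetch_after and the sample data are hypothetical, and foo is assumed to be unique and NOT NULL.

```python
import sqlite3


def fetch_after(conn, last_foo, page_size):
    """Return the next page of foo values that sort after last_foo.

    Pass last_foo=None for the first page. Assumes foo is unique and
    NOT NULL, so a single value identifies a position in the ordering.
    """
    if last_foo is None:
        cur = conn.execute(
            "SELECT foo FROM bar ORDER BY foo LIMIT ?",
            (page_size,),
        )
    else:
        cur = conn.execute(
            "SELECT foo FROM bar WHERE foo > ? ORDER BY foo LIMIT ?",
            (last_foo, page_size),
        )
    return [r[0] for r in cur.fetchall()]


# Hypothetical sample data: foo = 1..10, indexed via the primary key
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bar (foo INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO bar (foo) VALUES (?)", [(i,) for i in range(1, 11)])

page1 = fetch_after(conn, None, 4)
page2 = fetch_after(conn, page1[-1], 4)
page3 = fetch_after(conn, page2[-1], 4)
print(page1, page2, page3)  # [1, 2, 3, 4] [5, 6, 7, 8] [9, 10]
```

The trade-off is that this only supports "next page" style navigation. Jumping to an arbitrary page number, as page jump controls require, still needs OFFSET or some precomputed mapping.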

There are completely different approaches to get the count of affected rows (not the full count before OFFSET & LIMIT were applied). Postgres keeps internal bookkeeping of how many rows were affected by the last SQL command. Some clients can access that information or count rows themselves (like psql).

For instance, you can retrieve the number of affected rows in plpgsql immediately after executing an SQL command with:

GET DIAGNOSTICS integer_var = ROW_COUNT;

Details in the manual.

Or you can use pg_num_rows in PHP. Or similar functions in other clients.
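
To make the distinction concrete, here is a small Python sketch of what such a client-side count reports. It uses the standard-library sqlite3 module as a stand-in for a Postgres client (an assumption made for self-containment; it is not pg_num_rows itself), and the sample data is hypothetical. The number of returned rows is the page size, not the full count.

```python
import sqlite3

# Hypothetical sample data: foo = 1..10
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bar (foo INTEGER)")
conn.executemany("INSERT INTO bar (foo) VALUES (?)", [(i,) for i in range(1, 11)])

# Rows actually returned by a paged SELECT: counted on the client side.
page = conn.execute(
    "SELECT foo FROM bar WHERE foo > 3 ORDER BY foo LIMIT 3 OFFSET 0"
).fetchall()
returned = len(page)

# Rows affected by a data-modifying command: reported by the driver.
cur = conn.execute("UPDATE bar SET foo = foo WHERE foo > 3")
affected = cur.rowcount

print(returned, affected)  # 3 7
```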

Related:

  • Calculate number of rows affected by batch query in PostgreSQL