Order of a SQL SELECT statement without an ORDER BY clause

2021-12-02 00:00:00 select oracle sql-server sql-order-by

As far as I know, relational database theory says that a select statement without an order by clause should be considered to have no particular order. But in practice, in SQL Server and Oracle (I've tested on both platforms), if I query a table multiple times without an order by clause, I always get the results in the same order. Can this behavior be relied on? Can anyone help explain it?

Solution

No, that behavior cannot be relied on. The order is determined by the way the query planner decides to build the result set. Simple queries like select * from foo_table are likely to return rows in the order they are stored on disk, which may be primary key order, the order in which they were created, or some other arbitrary order. More complex queries, such as select * from foo where bar < 10, may instead return rows in the order of a different column when an index is read, or in table order when a table scan is used. Even more elaborate queries, with multiple where conditions, group by clauses, or unions, will come back in whatever order the planner decides is most efficient to generate.

The order could even change between two identical queries simply because data changed between them. A where clause may be satisfied with an index scan in one query, but later inserts could make that condition less selective, and the planner could decide to run a subsequent query using a table scan.
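
As a rough illustration of how the access path can affect row order, here is a minimal Python sketch using the standard-library sqlite3 module. SQLite stands in for SQL Server and Oracle purely for convenience (an assumption, not something from the question), and the foo/bar schema, the sample rows, and the index name foo_bar_idx are made up. The same query runs before and after an index is added; the two unordered results are guaranteed to contain the same rows, but nothing guarantees they list them in the same order.

```python
import sqlite3


def fetch_ids(conn, sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]


# In-memory database; the table and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (id INTEGER PRIMARY KEY, bar INTEGER)")
# Rows are chosen so that id order and bar order differ.
conn.executemany(
    "INSERT INTO foo (id, bar) VALUES (?, ?)",
    [(1, 9), (2, 3), (3, 7), (4, 1), (5, 20)],
)

query = "SELECT id FROM foo WHERE bar < 10"

# Same query text, run before and after an index on bar exists.
before_index = fetch_ids(conn, query)
conn.execute("CREATE INDEX foo_bar_idx ON foo (bar)")
after_index = fetch_ids(conn, query)

# Only this variant has a guaranteed order.
ordered = fetch_ids(conn, query + " ORDER BY id")

print("no index:   ", before_index)
print("with index: ", after_index)
print("ORDER BY id:", ordered)
```

Whether the first two lists actually differ depends on the plan the engine picks, which is exactly the point: only the ORDER BY result is something a caller can depend on.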


To put a finer point on it: an RDBMS has the mandate to give you exactly what you asked for, as efficiently as possible. That efficiency can take many forms, including minimizing IO (both to disk and over the network to send data to you), minimizing CPU, and keeping its working set small (using methods that require minimal temporary storage).

Without an ORDER BY clause, you have not asked for any particular order, so the RDBMS will give you the rows in some order that (maybe) corresponds to some coincidental aspect of the query, based on whichever algorithm the RDBMS expects to produce the data fastest.

If you care about efficiency, but not order, skip the ORDER BY clause. If you care about the order but not efficiency, use the ORDER BY clause.

Since you actually care about BOTH, use ORDER BY and then carefully tune your query and database so that it is efficient.
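
One common way to get both is to give the planner an index that already matches the requested order, so it may be able to skip a separate sort step. The sketch below uses Python's sqlite3 module as a stand-in (an assumption; SQL Server and Oracle have their own tools for inspecting plans), with a hypothetical foo table and index name. It prints SQLite's EXPLAIN QUERY PLAN text before and after the index is created. The exact wording of the plan varies by SQLite version, so no specific output is shown here.

```python
import sqlite3


def plan_text(conn, sql):
    """Return SQLite's plan description for a query as a list of strings."""
    # The human-readable detail is the last column of each plan row.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# In-memory database; the table and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (id INTEGER PRIMARY KEY, bar INTEGER)")
# 100 rows with distinct, scrambled bar values.
conn.executemany(
    "INSERT INTO foo (id, bar) VALUES (?, ?)",
    [(i, (i * 37) % 101) for i in range(1, 101)],
)

query = "SELECT id, bar FROM foo ORDER BY bar"

plan_without_index = plan_text(conn, query)
rows_without_index = conn.execute(query).fetchall()

# An index on the ORDER BY column lets the planner read rows pre-sorted.
conn.execute("CREATE INDEX foo_bar_idx ON foo (bar)")
plan_with_index = plan_text(conn, query)
rows_with_index = conn.execute(query).fetchall()

print("plan without index:", plan_without_index)
print("plan with index:   ", plan_with_index)
```

The ORDER BY result is the same in both runs; only the work the engine does to produce it can change, which is what the tuning step is about.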
