GROUP BY clause in MySQL and PostgreSQL: why does PostgreSQL raise an error?

2021-11-20 00:00:00 postgresql sql group-by mysql

Suppose I have a table named the_table with the following structure:

PostgreSQL:

 create table the_table (col3 SERIAL, col2 varchar, col1 varchar, PRIMARY KEY(col3));

MySQL:

create table the_table (col3 INT NOT NULL AUTO_INCREMENT PRIMARY KEY, col2 varchar(20), col1 varchar(20));

Then I inserted these rows into the table:

INSERT INTO the_table (col2,col1) VALUES 
('x','a'),
('x','b'),
('y','c'),
('y','d'),
('z','e'),
('z','f');

Now the table looks like this:

col3 | col2 | col1 
------+------+------
    1 | x    | a
    2 | x    | b
    3 | y    | c
    4 | y    | d
    5 | z    | e
    6 | z    | f

When I run this query:

select * from the_table group by col2

then in MySQL I get:

1 x a
3 y c
5 z e

and in PostgreSQL, I get this error:

ERROR:  column "the_table.col3" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: select * from the_table group by col2;

My questions:

What does this error mean? What is an aggregate function?

If the query works in MySQL, why doesn't it work in PostgreSQL?

Solution

You need to use an aggregate function:

Aggregate functions compute a single result from a set of input values.

SELECT col2, MIN(col3) AS col3, MIN(col1) AS col1
FROM the_table 
GROUP BY col2;

db<>fiddle demo
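To make the idea of an aggregate function more concrete, here is a minimal sketch against the_table from the question. Each aggregate collapses all rows of a group into a single value. The aliases row_count, first_id and last_id are names chosen here purely for illustration.

```sql
-- One output row per distinct col2; every other selected expression is an aggregate.
SELECT col2,
       COUNT(*)  AS row_count,  -- number of rows in the group
       MIN(col3) AS first_id,   -- smallest col3 in the group
       MAX(col3) AS last_id     -- largest col3 in the group
FROM the_table
GROUP BY col2
ORDER BY col2;
```

With the six rows inserted above, every group contains two rows, so this should return (x, 2, 1, 2), (y, 2, 3, 4) and (z, 2, 5, 6) in both MySQL and PostgreSQL.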


MySQL Handling of GROUP BY:

In standard SQL, a query that includes a GROUP BY clause cannot refer to nonaggregated columns in the select list that are not named in the GROUP BY clause

and:

MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. This means that the preceding query is legal in MySQL. You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate

So in MySQL, a query without an explicit aggregate function may return nondeterministic values. I strongly suggest using a specific aggregate function.
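Note that MIN(col3) and MIN(col1) are computed independently, so they only happen to come from the same row in this sample data. If the goal is to get one whole row per col2 deterministically, the following sketch shows two possible approaches. It assumes PostgreSQL for DISTINCT ON, and window-function support (PostgreSQL, or MySQL 8.0 and later) for ROW_NUMBER(). The aliases ranked and rn are chosen here for illustration.

```sql
-- PostgreSQL only: keep the first row of each col2 group, ordered by col3.
SELECT DISTINCT ON (col2) col3, col2, col1
FROM the_table
ORDER BY col2, col3;

-- More portable alternative using a window function.
SELECT col3, col2, col1
FROM (
    SELECT col3, col2, col1,
           ROW_NUMBER() OVER (PARTITION BY col2 ORDER BY col3) AS rn
    FROM the_table
) AS ranked
WHERE rn = 1
ORDER BY col2;
```

On the sample data both queries should return the rows (1, x, a), (3, y, c) and (5, z, e). Also keep in mind that MySQL 5.7.5 and later enable the ONLY_FULL_GROUP_BY SQL mode by default, so the original query is rejected there as well unless that mode is turned off. Wrapping a column in ANY_VALUE() is the explicit way to accept an arbitrary value from each group.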


EDIT:

From MySQL Handling of GROUP BY:

SQL92 and earlier does not permit queries for which the select list, HAVING condition, or ORDER BY list refer to nonaggregated columns that are not named in the GROUP BY clause.

SQL99 and later permits such nonaggregates per optional feature T301 if they are functionally dependent on GROUP BY columns: If such a relationship exists between name and custid, the query is legal. This would be the case, for example, were custid a primary key of customers.

Example:

SELECT o.custid, c.name, MAX(o.payment)
FROM orders AS o
JOIN customers AS c
  ON o.custid = c.custid
GROUP BY o.custid;
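PostgreSQL (since version 9.1) recognizes the same kind of functional dependency when the GROUP BY column is the table's primary key. A minimal sketch on the_table from the question, where col3 is the primary key:

```sql
-- Accepted by PostgreSQL: col2 and col1 are functionally dependent on the primary key col3.
SELECT col3, col2, col1
FROM the_table
GROUP BY col3;

-- Rejected by PostgreSQL: col2 is not a key, so it does not determine col3 or col1.
-- SELECT col3, col2, col1 FROM the_table GROUP BY col2;
```

This is why the error in the question names col3: grouping by col2 leaves several candidate values of col3 per group, and PostgreSQL refuses to pick one silently.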
