如何获取最频繁的值SQL
我有一个表 Orders(id_trip, id_order)、表 Trip(id_hotel, id_bus, id_type_of_trip) 和表 Hotel(id_hotel, name).
I have a table Orders(id_trip, id_order), table Trip(id_hotel, id_bus, id_type_of_trip) and table Hotel(id_hotel, name).
我想获得订单表中最常出现的酒店的名称.
I would like to get name of the most frequent hotel in table Orders.
SELECT hotel.name from Orders
JOIN Trip
on Orders.id_trip = Trip.id_hotel
JOIN hotel
on trip.id_hotel = hotel.id_hotel
FROM (SELECT hotel.name, rank() over (order by cnt desc) rnk
FROM (SELECT hotel.name, count(*) cnt
FROM Orders
GROUP BY hotel.name))
WHERE rnk = 1;
推荐答案
分布中最常出现的值"是统计学中的一个独特概念,具有技术名称.它被称为分布的模式.Oracle 有 STATS_MODE()
函数.https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions154.htm
The "most frequently occurring value" in a distribution is a distinct concept in statistics, with a technical name. It's called the MODE of the distribution. And Oracle has the STATS_MODE()
function for it. https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions154.htm
例如,使用标准SCOTT
模式中的EMP
表,select stats_mode(deptno) from scott.emp
将返回30
- 员工最多的部门数量.(30 是部门名称"或编号,不是该部门的员工人数!)
For example, using the EMP
table in the standard SCOTT
schema, select stats_mode(deptno) from scott.emp
will return 30
- the number of the department with the most employees. (30 is the department "name" or number, it is NOT the number of employees in that department!)
就你而言:
select stats_mode(h.name) from (the rest of your query)
注意:如果两个或多个酒店并列为最频繁",则 STATS_MODE()
将返回其中之一(非确定性).如果您需要所有绑定值,您将需要一个不同的解决方案 - 一个很好的例子在文档中(上面链接).这是 Oracle 对统计概念的理解和实施中的一个记录缺陷.
Note: if two or more hotels are tied for "most frequent", then STATS_MODE()
will return one of them (non-deterministic). If you need all the tied values, you will need a different solution - a good example is in the documentation (linked above). This is a documented flaw in Oracle's understanding and implementation of the statistical concept.
相关文章