如何获取最频繁的值SQL

2021-12-30 00:00:00 sql oracle11g oracle

我有一个表 Orders(id_trip, id_order)、表 Trip(id_hotel, id_bus, id_type_of_trip) 和表 Hotel(id_hotel, name).

I have a table Orders(id_trip, id_order), table Trip(id_hotel, id_bus, id_type_of_trip) and table Hotel(id_hotel, name).

我想获得订单表中最常出现的酒店的名称.

I would like to get name of the most frequent hotel in table Orders.

SELECT hotel.name from Orders JOIN Trip on Orders.id_trip = Trip.id_hotel JOIN hotel on trip.id_hotel = hotel.id_hotel FROM (SELECT hotel.name, rank() over (order by cnt desc) rnk FROM (SELECT hotel.name, count(*) cnt FROM Orders GROUP BY hotel.name)) WHERE rnk = 1;

推荐答案

分布中最常出现的值"是统计学中的一个独特概念，具有技术名称.它被称为分布的模式.Oracle 有 STATS_MODE() 函数.https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions154.htm

The "most frequently occurring value" in a distribution is a distinct concept in statistics, with a technical name. It's called the MODE of the distribution. And Oracle has the STATS_MODE() function for it. https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions154.htm

例如，使用标准SCOTT模式中的EMP表，select stats_mode(deptno) from scott.emp将返回30 - 员工最多的部门数量.(30 是部门名称"或编号，不是该部门的员工人数！)

For example, using the EMP table in the standard SCOTT schema, select stats_mode(deptno) from scott.emp will return 30 - the number of the department with the most employees. (30 is the department "name" or number, it is NOT the number of employees in that department!)

就你而言:

select stats_mode(h.name) from (the rest of your query)

注意:如果两个或多个酒店并列为最频繁"，则 STATS_MODE() 将返回其中之一(非确定性).如果您需要所有绑定值，您将需要一个不同的解决方案 - 一个很好的例子在文档中(上面链接).这是 Oracle 对统计概念的理解和实施中的一个记录缺陷.

Note: if two or more hotels are tied for "most frequent", then STATS_MODE() will return one of them (non-deterministic). If you need all the tied values, you will need a different solution - a good example is in the documentation (linked above). This is a documented flaw in Oracle's understanding and implementation of the statistical concept.

相关文章