Calculating the median with MySQL

2022-01-07 00:00:00 statistics mysql median

I'm having trouble with calculating the median of a list of values, not the average.

I found this article: Simple way to calculate median with MySQL.

It references the following query, which I don't fully understand.

SELECT x.val FROM data x, data y
GROUP BY x.val
HAVING SUM(SIGN(1-SIGN(y.val-x.val))) = (COUNT(*)+1)/2;

If I have a time column and I want to calculate the median value, what do the x and y columns refer to?

Recommended answer

val is your time column; x and y are two aliases for the same data table (you can write data AS x, data AS y).
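To make the roles of x and y concrete, here is a rough sketch that mimics the query in Python rather than SQL. It is an illustration under assumptions, not the article's method: the function name median_via_self_join and the sample values are made up, and it assumes that MySQL's / operator performs decimal division, so (COUNT(*)+1)/2 is 2.5 for four rows.

```python
def sign(n):
    """Same result as SQL SIGN(): -1, 0 or 1."""
    return (n > 0) - (n < 0)


def median_via_self_join(values):
    """Mimic the self-join query on a list standing in for data.val."""
    result = []
    # GROUP BY x.val: one group per distinct value of x.val.
    for x_val in sorted(set(values)):
        # "FROM data x, data y" pairs every x row with every y row;
        # this group keeps the pairs whose x row has the value x_val.
        pairs = [(x, y) for x in values if x == x_val for y in values]
        # SIGN(1 - SIGN(y - x)) is 1 when y <= x and 0 otherwise,
        # so the sum counts how many values are <= x_val.
        not_greater = sum(sign(1 - sign(y - x)) for x, y in pairs)
        # HAVING ... = (COUNT(*)+1)/2, using true (non-integer) division.
        if not_greater == (len(pairs) + 1) / 2:
            result.append(x_val)
    return result


odd_result = median_via_self_join([30, 10, 50, 20, 40])
even_result = median_via_self_join([10, 20, 30, 40])
```

Here odd_result is [30]: three of the five values are less than or equal to 30, and (5+1)/2 is 3. For the four-value list the target is 2.5, which a whole-number count can never equal, so even_result is empty. This suggests the query is only reliable for an odd number of distinct values.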

To avoid computing your sums twice, you can store the intermediate results.

CREATE TEMPORARY TABLE average_user_total_time
    (SELECT SUM(time) AS time_taken
     FROM scores
     WHERE created_at >= '2010-10-10'
       AND created_at <= '2010-11-11'
     GROUP BY user_id);

Then you can compute the median over these values, which are now in a named table.
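The two steps (store the per-user totals, then take their median) can be sketched in Python as follows. This is only an illustration: the sample rows and the helper name total_time_per_user are hypothetical, and the median step uses Python's statistics.median instead of the self-join query.

```python
from collections import defaultdict
from datetime import date
from statistics import median

# Hypothetical rows of the scores table: (user_id, time, created_at).
scores = [
    (1, 30, date(2010, 10, 12)),
    (1, 20, date(2010, 10, 20)),
    (2, 70, date(2010, 10, 15)),
    (3, 10, date(2010, 11, 1)),
    (3, 15, date(2010, 12, 1)),  # outside the date range, so ignored
]


def total_time_per_user(rows, start, end):
    """Mimic SUM(time) ... WHERE created_at in [start, end] GROUP BY user_id."""
    totals = defaultdict(int)
    for user_id, time_taken, created_at in rows:
        if start <= created_at <= end:
            totals[user_id] += time_taken
    return dict(totals)


# Step 1: the intermediate result (the role of the named table).
totals = total_time_per_user(scores, date(2010, 10, 10), date(2010, 11, 11))

# Step 2: the median over the stored per-user totals.
median_total = median(totals.values())
```

With these rows the totals are 50, 70 and 10, so median_total is 50. Note that statistics.median averages the two middle values when the count is even, which is a different convention from the HAVING-based query.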

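A temporary table won't work here after all: the median query joins the table to itself, and MySQL does not let a query refer to the same TEMPORARY table more than once. You could try a regular table with the MEMORY storage engine instead, or simply repeat the subquery that computes the values twice in the median query, once as x and once as y. Apart from this, I don't see another solution. That doesn't mean there isn't a better way; maybe somebody else will come up with an idea.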
