ORDER BY RAND() seems less than random

2022-01-07 00:00:00 statistics random sql mysql

I have a fairly simple SQL (MySQL):

SELECT foo FROM bar ORDER BY rank, RAND()

I notice that when I refresh the results, the randomness is suspiciously weak.

In the sample data at the moment there are six results with equal rank (integer zero). There are lots of tests for randomness, but here is a simple one to do by hand: when the query is run twice, the first result should be the same in both runs about one sixth of the time. This is certainly not happening; the leading result is the same at least a third of the time.
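The one-in-six figure can be checked with a quick simulation. The following is a minimal sketch in Python, not MySQL itself: it assumes six rows with equal rank and an ideal uniform shuffle, and the function name `first_result_match_rate` is made up for illustration.

```python
import random

def first_result_match_rate(n_rows=6, trials=100_000, seed=0):
    """Estimate how often two independent uniform shuffles share the same first row."""
    rng = random.Random(seed)          # fixed seed so the sketch is reproducible
    rows = list(range(n_rows))         # stand-ins for the six equal-rank rows
    matches = 0
    for _ in range(trials):
        first_a = rng.sample(rows, n_rows)[0]   # first row of "run 1"
        first_b = rng.sample(rows, n_rows)[0]   # first row of "run 2"
        matches += first_a == first_b
    return matches / trials

rate = first_result_match_rate()
print(f"observed match rate: {rate:.3f} (uniform expectation: {1/6:.3f})")
```

If the ordering were uniform, the observed rate should sit close to 1/6; a rate near 1/3 on real query results would support the suspicion described above.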

I want a uniform distribution over the permutations. I'm not an expert statistician but I'm pretty sure ORDER BY RAND() should achieve this. What am I missing?

With MySQL, SELECT rand(), rand() shows two different numbers, so I don't buy the "once per query" explanation.

Recommended answer

In SQL Server, RAND() is only executed once per query, so every row receives the same value. You can verify this by looking at the result set.

If you're trying to get a randomized order, you should be using either NEWID() or CHECKSUM(NEWID()). Note that NEWID() and CHECKSUM() are SQL Server (T-SQL) functions and are not available in MySQL.

WITH T AS ( -- example using RAND()
  SELECT 'Me' Name UNION SELECT 'You' UNION SELECT 'Another'
)
SELECT Name, RAND()
FROM T;

WITH T AS ( -- example using just NEWID()
  SELECT 'Me' Name UNION SELECT 'You' UNION SELECT 'Another'
)
SELECT Name, NEWID()
FROM T;

WITH T AS ( -- example getting the CHECKSUM() of NEWID()
  SELECT 'Me' Name UNION SELECT 'You' UNION SELECT 'Another'
)
SELECT Name, CHECKSUM(NEWID())
FROM T;
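To see why a sort key that is evaluated once per query cannot randomize the order, here is a rough Python model rather than database code. The helper `order_by_random_key` is hypothetical, and it assumes a stable sort; a real engine makes no promise about how it orders rows whose sort keys tie.

```python
import random

def order_by_random_key(rows, per_row, rng):
    """Sort rows by a random key drawn either once per row or once per 'query'."""
    if per_row:
        keys = [rng.random() for _ in rows]   # fresh value for every row
    else:
        keys = [rng.random()] * len(rows)     # one value shared by all rows
    # sorted() is stable, so rows with equal keys keep their input order
    return [row for _, row in sorted(zip(keys, rows), key=lambda pair: pair[0])]

rng = random.Random(1)
rows = ["Me", "You", "Another"]

# Collect the distinct orderings seen over 200 simulated queries of each kind
constant = {tuple(order_by_random_key(rows, False, rng)) for _ in range(200)}
varying = {tuple(order_by_random_key(rows, True, rng)) for _ in range(200)}

print("orderings with one key per query:", len(constant))
print("orderings with one key per row:  ", len(varying))
```

With a single shared key every row ties, so only the input order ever appears; with a key per row, multiple orderings show up across runs.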
