Why is the use of rand() considered bad?

2021-12-21 00:00:00 random c++ lcg

Using rand() is usually frowned upon, even when it is seeded via srand(). Why is that? What better alternatives are available?

Recommended answer

There are two parts to this story.

First, rand is a pseudorandom number generator. This means its output depends on a seed: for a given seed it always produces the same sequence (assuming the same implementation). That makes it unsuitable for applications where security is a major concern. But this is not specific to rand; it applies to any pseudorandom generator, and there are certainly many classes of problems where a pseudorandom generator is acceptable. A true random generator has its own issues (efficiency, implementation, entropy), so for problems that are not security related a pseudorandom generator is used most often.
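
As a minimal sketch of the "same seed, same sequence" behavior, the helper below reseeds the generator and collects the next few values. The name first_values is hypothetical and used only for illustration.

```cpp
#include <cstdlib>
#include <vector>

// Hypothetical helper: reseed the global generator and collect the next n values.
std::vector<int> first_values(unsigned seed, int n) {
    std::srand(seed);  // resets the single, process-wide generator state
    std::vector<int> out;
    out.reserve(n);
    for (int i = 0; i < n; ++i) {
        out.push_back(std::rand());
    }
    return out;
}
```

In a single-threaded program, first_values(42, 5) == first_values(42, 5) holds on any one implementation, although the actual numbers may differ between implementations.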

So you have analyzed your problem and concluded that a pseudorandom generator is the solution. Here we arrive at the real troubles with the C random library (which includes rand and srand): troubles that are specific to it and make it obsolete (a.k.a. the reasons you should never use rand and the C random library).

  • One issue is that it has global state (set by srand). This makes it impossible to use multiple random engines at the same time. It also greatly complicates multithreaded tasks.

  • The most visible problem is that it lacks a distribution engine: rand gives you a number in the interval [0, RAND_MAX]. It is uniform in this interval, meaning each number in the interval has the same probability of appearing. But most often you need a random number in a specific interval, say [0, 1017]. A commonly used (and naive) formula is rand() % 1018. The issue is that unless RAND_MAX + 1, the number of possible values, is an exact multiple of 1018, you won't get a uniform distribution.

  • Another issue is the quality of implementation of rand. There are other answers here detailing this better than I could, so please read them.
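
To make the rand() % 1018 point concrete, here is a small sketch that counts how many raw values land on each result. It does not call rand() at all. It enumerates every value in [0, max_raw], where max_raw stands in for RAND_MAX. The function name residue_counts is hypothetical.

```cpp
#include <vector>

// Count how many raw values in [0, max_raw] map to each result of `% buckets`.
// max_raw stands in for RAND_MAX (assumption: 32767, the minimum the C standard allows).
std::vector<int> residue_counts(int max_raw, int buckets) {
    std::vector<int> counts(buckets);  // value-initialized to zero
    for (int v = 0; v <= max_raw; ++v) {
        ++counts[v % buckets];
    }
    return counts;
}
```

With max_raw = 32767 and buckets = 1018 there are 32768 raw values, and 32768 = 32 * 1018 + 192. Results 0 through 191 are therefore each produced by 33 raw values, while results 192 through 1017 are produced by only 32, so the low results are roughly 3% more likely.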

In modern C++ you should definitely use the C++ library from <random>, which comes with multiple well-defined random engines and various distributions for integer and floating-point types.
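
A minimal sketch of that approach, assuming std::mt19937 as the engine (any standard engine would do) and a hypothetical helper named draw_in_range:

```cpp
#include <random>

// Hypothetical helper: draw one value uniformly from [lo, hi] using the caller's engine.
// The engine is an ordinary object, so there is no hidden global state.
int draw_in_range(std::mt19937& engine, int lo, int hi) {
    std::uniform_int_distribution<int> dist(lo, hi);  // both bounds are inclusive
    return dist(engine);
}
```

A caller might create std::mt19937 engine(123); for a reproducible run, or seed it from std::random_device, and then call draw_in_range(engine, 0, 1017). Because each engine carries its own state, several engines can coexist, for example one per thread, and the distribution takes care of mapping the engine's output onto the requested range without the modulo bias.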
