多线程的随机数

我打算为 Linux 编写一个 C++11 应用程序,它基于大约一百万个伪随机 32 位数字进行一些数值模拟(不是密码学).为了加快速度,我想使用台式机 CPU 的所有内核在并行线程中执行模拟.我想使用 Mersenne Twister mt19937 由 boost 作为 PRNG 提供,我想出于性能原因我应该每个线程有一个这样的 PRNG.现在我不确定如何为它们播种以避免在多个线程中生成相同的随机数子序列.

I intend to write a C++11 application for Linux which does some numerical simulation (not cryptography) based on approximately one million pseudorandom 32bit numbers. To speed things up, I'd like to perform the simulation in parallel threads using all cores of a desktop CPU. I'd like to use the Mersenne Twister mt19937 provided by boost as the PRNG, and I guess that for performance reasons I should have one such PRNG per thread. Now I'm unsure about how to seed them in order to avoid generating the same subsequence of random numbers in multiple threads.

以下是我目前想到的替代方案:

Here are the alternatives that I have thought of so far:

  1. 独立于 /dev/urandom 为每个线程播种 PRNG.

  1. Seed the PRNG for every thread independently from /dev/urandom.

我有点担心系统熵池耗尽的情况,因为我不知道系统内部PRNG是如何运作的.由于 /dev/urandom 使用的是 Mersenne Twister 本身,我会不会意外地获得准确识别 Mersenne Twister 连续状态的连续种子?可能与我对下一点的担忧密切相关.

I'm a bit worried about the case when the system entropy pool gets exhausted, as I don't know how the system internal PRNG operates. Could it happen that I accidentially get consecutive seeds which exactly identify consecutive states of the Mersenne Twister, due to the fact that /dev/urandom is using a Mersenne Twister itself? Probably strongly related to my concerns for the next point.

/dev/urandom 中播种一个 PRNG,从第一个中播种其他.

Seed one PRNG from /dev/urandom and the others from that first one.

基本上也是同样的问题:使用一个 PRNG 来播种另一个使用相同算法的 PRNG 是好是坏?或者换句话说,从 mt19937 读取 625 个 32 位整数是否直接对应于 mt19937 生成器在此生成过程中的任何时刻的内部状态?

Basically the same concern as well: is it good or bad to use one PRNG to seed another that uses the same algorithm? Or in other words, does reading 625 32bit integers from a mt19937 correspond directly to the internal state of the mt19937 generator at any point during this generation?

从一开始就用非梅森信息播种其他人.

Seed others from first with non-Mersenne information.

由于使用相同的算法来生成随机数并生成初始种子,在某种程度上感觉这可能是个坏主意,因此我考虑引入一些不依赖于 Mersenne Twister 算法的元素.例如,我可以将线程 id 异或到初始种子向量的每个元素中.这会让事情变得更好吗?

As using the same algorithm to generate random numbers and to generate the initial seed feels somehow like it might be a bad idea, I thought about introducing some element which is not dependent on the Mersenne Twister algorithm. For example, I could XOR the thread id into each element of the initial seed vector. Does that make things any better?

在线程之间共享一个 PRNG.

Share one PRNG among threads.

这将确保只有一个序列,具有梅森扭曲器的所有已知和理想特性.但是控制对该生成器的访问所需的锁定开销确实让我有些担心.由于我没有发现相反的证据,我假设我作为图书馆用户将负责防止对 PRNG 的并发访问.

This would make sure that there is only one sequence, with all the known and desirable properties of the Mersenne Twister. But the locking overhead required to control access to that generator does worry me somewhat. As I have found no evidence to the contrary, I assume that I as the library user would be responsible for preventing concurrent access to the PRNG.

预先生成所有随机数.

这将使一个线程预先生成所有必需的 1M 随机数,供以后不同的线程使用.与整个应用程序的内存需求相比,4M 的内存需求会很小.这种方法最让我担心的是随机数的生成本身不是并发的.整个方法也不能很好地扩展.

This would have one thread generate all the required 1M random numbers up front, to be used by the different threads later on. The memory requirement of 4M would be small compared to that of the overall application. What worries me most in this approach is that the generation of random numbers itself is not concurrent. This whole approach also doesn't scale too well.

问题

您会建议采用哪种方法,为什么?或者您有什么不同的建议?

Questions

Which of these approaches would you suggest, and why? Or do you have a different suggestion?

你知道我的哪些担忧是合理的,哪些仅仅是因为我对事情的实际运作方式缺乏洞察力?

Do you know which of my concerns are justified and which are simply due to my lack of insight into how things actually work?

推荐答案

我会使用一个实例来播种其他实例.我很确定你可以很容易地安全地做到这一点.

I'd use one instance to seed the others. I'm pretty sure you can do this safely fairly easily.

  • 即使状态空间的微小变化也会导致下游发生相当大的变化――如果你能确保它们没有完全相同的起始空间(并且没有相同的状态前缀),我就不会担心产生相同的数字.例如,仅使用值 1,2,3 来播种三个线程就可以正常工作 - 您甚至不需要播种整个空间.另一个优势:通过使用明显可预测的种子,您可以轻松地否定您正在挑选任何运行的想法(假设您正在尝试展示某些东西).
  • 以某种方式播种是微不足道的,这意味着所产生的子项"是高度不相关的.只需以广度优先的方式进行迭代;即如果你想播种 N x 623 个 int 值,不要顺序播种 623 个值,而是选择第一个 N 并分发,然后是下一个 N 等等.即使播种者和孩子之间存在某种相关性,各种孩子应该几乎不存在 - 这就是您关心的全部.
  • 我更喜欢一种尽可能允许确定性执行的算法,因此依赖 urandom 没有吸引力.这使调试更容易.
  • 最后,很明显 - 测试.这些 PRNG 相当稳健,但无论如何都要注意结果并根据您模拟的内容进行一些相关性测试.大多数问题应该是显而易见的 - 要么你播种不好并且有明显的重复子序列,要么你播种得很好,然后质量取决于 PRNG 限制.
  • 对于最终执行,在完成测试后,您可以使用 urandom 为 623 个状态值中的第一个设置种子,以确保安心和/或线程 ID.
  • Even small changes in the state space cause fairly large changes downstream - if you can ensure they don't have exactly the same starting space (and no identical state prefix), I wouldn't worry about producing identical numbers. For instance, using just the values 1,2,3 to seed three threads would work fine - you don't even need to seed the entire space. Another advantage: by using clearly predictable seeds you can easily discredit the idea that you're cherry-picking any runs (assuming you're trying to demonstrate something).
  • It's trivial to seed in a way that means the resultant "children" are highly un-correlated. Just iterate in a breadth-first fashion; i.e. if you want to seed N x 623 int values, don't seed 623 values sequentially, but pick the first N and distribute, then the next N etc. Even if there's some correlation between the seeder and the children, the correlation between the various children should be virtually non-existant - and that's all you care about.
  • I'd prefer an algorithm that allows deterministic execution whenever possible, so depending on urandom is not attractive. This makes debugging easier.
  • Finally, and obviously - test. These PRNG are fairly robust, but by all means eyeball the results and do a few correlation tests inspired by what you're simulating. Most problems should be obvious - either you've seeded badly and there are obvious repeating subsequences, you you've seeded well, and then the quality is dictated by the PRNG limitations.
  • For final executions, after you're done testing, you can seed the first of 623 state values using urandom for peace of mind and/or the thread ID.

相关文章