Why not just use random_device?

2021-12-21 00:00:00 random c++ c++11

I am a bit confused about the C++11 random library.

What I understand: we need two separate concepts:

random engine (which can be pseudo-random (needs a seed) or truly random)
distribution: it maps the numbers obtained from the engine to a specific interval, using a specific distribution.

What I don't understand is why not just use this:

std::random_device rd;
std::uniform_int_distribution<int> dist(1, 5);

// get random numbers with:
dist(rd);

As far as I can tell this works well.

Instead, this is what I found on most examples/sites/articles:

std::random_device rd;
std::mt19937 e{rd()}; // or std::default_random_engine e{rd()};
std::uniform_int_distribution<int> dist{1, 5};

// get random numbers with:
dist(e);

I am not talking about special use, e.g. cryptography, just your basic getting started articles.

My suspicion is that, because std::mt19937 (or std::default_random_engine) accepts a seed, it is easier to debug: you can provide the same seed in every debug session.
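A minimal sketch of that debugging argument, assuming a hypothetical helper named `roll` (not from the original post): two engines constructed with the same seed produce the same sequence, which a bare std::random_device cannot offer. Note that the standard fixes the output of std::mt19937 itself, but not how std::uniform_int_distribution maps it, so the exact values may differ between standard library implementations.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper: draw `count` values in [1, 5] from an mt19937
// that was seeded with `seed`.
std::vector<int> roll(std::uint32_t seed, int count) {
    assert(count >= 0);
    std::mt19937 engine{seed};                      // deterministic for a given seed
    std::uniform_int_distribution<int> dist{1, 5};  // closed interval [1, 5]
    std::vector<int> out;
    out.reserve(static_cast<std::size_t>(count));
    for (int i = 0; i < count; ++i) {
        out.push_back(dist(engine));
    }
    return out;
}
```

Calling `roll(42, 10)` twice in the same build gives two identical vectors, so a failing run can be replayed by logging the seed and passing it back in.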

Also, why not just:

std::mt19937 e{std::random_device{}()};

Solution

Also, why not just:

std::mt19937 e{std::random_device{}()};

It might be fine if you only do this once, but if you do it many times, it's better to keep your std::random_device around and not create and destroy it unnecessarily.
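One way to follow this advice, sketched under assumptions: the helper names `global_engine` and `random_int` are hypothetical, and a per-thread engine is just one possible design. The std::random_device is touched exactly once per thread, when the engine is first used, and every later call reuses the already seeded engine.

```cpp
#include <cassert>
#include <random>

// Hypothetical helper: one engine per thread, seeded from std::random_device
// exactly once, the first time this function is called on that thread.
std::mt19937& global_engine() {
    thread_local std::mt19937 engine{std::random_device{}()};
    return engine;
}

// Hypothetical helper: uniform integer in the closed interval [lo, hi],
// drawn from the shared engine. Distributions are cheap to construct.
int random_int(int lo, int hi) {
    assert(lo <= hi);
    std::uniform_int_distribution<int> dist{lo, hi};
    return dist(global_engine());
}
```

With this in place, code elsewhere just calls `random_int(1, 5)` and never constructs a std::random_device of its own.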

It may be helpful to look at the libc++ source code for std::random_device, which is quite simple. On platforms where it reads from /dev/urandom, it is just a thin wrapper around opening that file. So each time you create a std::random_device you are getting another file handle, and paying all the associated costs.

On Windows, as I understand it, std::random_device represents some call to a Microsoft crypto API, so you are going to be initializing and destroying some crypto library interface every time you do this.

It depends on your application, but for general purposes I wouldn't think of this overhead as always negligible. Sometimes it is, and then this is great.

I guess this ties back into your first question:

Instead, this is what I found on most examples/sites/articles:

 std::random_device rd;
 std::mt19937 e{rd()}; // or std::default_random_engine e{rd()};
 std::uniform_int_distribution<int> dist{1, 5};

At least the way I think about it is:

std::mt19937 is a very simple and reliable random generator. It is self-contained and will live entirely in your process, not calling out to the OS or anything else. The algorithm is mandated by the standard, and at least in Boost, it used the same code everywhere, derived from the original mt19937 paper. This code is very stable and it's cross-platform. You can be pretty confident that initializing it, querying it, etc. is going to compile to similar code on any platform that you compile it on, and that you will get similar performance.
std::random_device by contrast is pretty opaque. You don't really know exactly what it is, what it's going to do, or how efficient it will be. You don't even know if it can actually be acquired -- it might throw an exception when you attempt to create it. You know that it doesn't require a seed. You're not usually supposed to pull tons and tons of data from it, just use it to generate seeds. Sometimes it acts as a nice interface to cryptographic APIs, but it's not actually required to do that, and sadly sometimes it doesn't. It might correspond to /dev/random on Unix, or it might correspond to /dev/urandom. It might correspond to some MSVC crypto API (Visual Studio), or it might just produce a fixed sequence (older MinGW builds). If you cross-compile for some phone, who knows what it will do. (And even when you do get /dev/random, you still have the problem that performance may not be consistent -- it may appear to work great until the entropy pool runs out, and then it runs slow as a dog.)
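Since std::random_device may throw when it is constructed or read, seeding code can guard against that. The following is a sketch, not a recommendation: `make_seed`, `make_engine` and `roll_die` are hypothetical names, and the clock fallback is an assumption about what is acceptable for non-cryptographic use.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <exception>
#include <random>

// Hypothetical helper: get a 32-bit seed from std::random_device, falling
// back to a clock reading if the device cannot be constructed or read.
std::uint32_t make_seed() {
    try {
        std::random_device rd;  // may throw if no source is available
        return static_cast<std::uint32_t>(rd());
    } catch (const std::exception&) {
        // Weak fallback in the spirit of time(NULL): not for security use.
        const auto ticks =
            std::chrono::steady_clock::now().time_since_epoch().count();
        return static_cast<std::uint32_t>(ticks);
    }
}

// Hypothetical helper: a self-contained engine that never touches the OS
// again after it has been seeded.
std::mt19937 make_engine() {
    return std::mt19937{make_seed()};
}

// Hypothetical helper: roll a die with `sides` faces using the given engine.
int roll_die(std::mt19937& engine, int sides) {
    assert(sides >= 1);
    std::uniform_int_distribution<int> dist{1, sides};
    return dist(engine);
}
```

The opaque, possibly failing part is confined to `make_seed`; everything after that runs on std::mt19937, whose behaviour and cost are predictable.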

The way I think about it is, std::random_device is supposed to be like an improved version of seeding with time(NULL) -- that's a low bar, because time(NULL) is a pretty crappy seed all things considered. I usually use it where I would have used time(NULL) to generate a seed, back in the day. I don't really consider it all that useful outside of that.
