散列任意精度值 (boost::multiprecision::cpp_int)

2021-12-24 00:00:00 hash c++ arbitrary-precision c++11 boost

我需要获得任意精度的值的哈希值(来自 Boost.Multiprecision)；我使用 cpp_int 后端.我想出了以下代码:

I need to get the hash of a value with arbitrary precision (from Boost.Multiprecision); I use the cpp_int backend. I came up with the following code:

boost::multiprecision::cpp_int x0 = 1; const auto seed = std::hash<std::string>{}(x0.str());

我不需要代码尽可能快，但我发现散列字符串表示非常笨拙.

I don't need the code to be as fast as possible, but I find it very clumsy to hash the string representation.

所以我的问题是双重的:

So my question is twofold:

保持任意精度，我可以更有效地散列值吗?
也许我不应该坚持保持任意精度，我应该转换为一个 double ，我可以很容易地散列(但是我仍然会使用任意精度值对哈希表进行所需的比较)?

Keeping the arbitrary precision, can I hash the value more efficiently?

Maybe I should not insisting on keeping the arbitrary precision and I should convert to a double which I could hash easily (I would still however make the comparison needed for the hash table using the arbitrary precision value)?

推荐答案

您可以(ab)使用序列化支持:

You can (ab)use the serialization support:

对序列化的支持有两种形式:类 number、debug_adaptor、logged_adaptor 和 rational_adaptor 具有直通"序列化支持，这需要底层后端可序列化.

Support for serialization comes in two forms: Classes number, debug_adaptor, logged_adaptor and rational_adaptor have "pass through" serialization support which requires the underlying backend to be serializable.

后端cpp_int、cpp_bin_float、cpp_dec_float 和float128 完全支持Boost.Serialization.

Backends cpp_int, cpp_bin_float, cpp_dec_float and float128 have full support for Boost.Serialization.

所以，让我拼凑一些适用于 boost 和 std 无序容器的东西:

So, let me cobble something together that works with boost and std unordered containers:

template <typename Map> void test(Map const& map) { std::cout << " " << __PRETTY_FUNCTION__ << " "; for(auto& p : map) std::cout << p.second << " " << p.first << " "; } int main() { using boost::multiprecision::cpp_int; test(std::unordered_map<cpp_int, std::string> { { cpp_int(1) << 111, "one" }, { cpp_int(2) << 222, "two" }, { cpp_int(3) << 333, "three" }, }); test(boost::unordered_map<cpp_int, std::string> { { cpp_int(1) << 111, "one" }, { cpp_int(2) << 222, "two" }, { cpp_int(3) << 333, "three" }, }); }

让我们将相关的 hash<> 实现转发到我们自己的使用多精度和序列化的 hash_impl 专业化:

Let's forward the relevant hash<> implementations to our own hash_impl specialization that uses Multiprecision and Serialization:

namespace std { template <typename backend> struct hash<boost::multiprecision::number<backend> > : mp_hashing::hash_impl<boost::multiprecision::number<backend> > {}; } namespace boost { template <typename backend> struct hash<multiprecision::number<backend> > : mp_hashing::hash_impl<multiprecision::number<backend> > {}; }

当然，这就引出了一个问题，hash_impl 是如何实现的?

Now, of course, this begs the question, how is hash_impl implemented?

template <typename T> struct hash_impl { size_t operator()(T const& v) const { using namespace boost; size_t seed = 0; { iostreams::stream<hash_sink> os(seed); archive::binary_oarchive oa(os, archive::no_header | archive::no_codecvt); oa << v; } return seed; } };

这看起来很简单.那是因为 Boost 很棒，编写一个 hash_sink 设备用于 Boost Iostreams 只是以下简单的练习:

This looks pretty simple. That's because Boost is awesome, and writing a hash_sink device for use with Boost Iostreams is just the following straightforward exercise:

namespace io = boost::iostreams; struct hash_sink { hash_sink(size_t& seed_ref) : _ptr(&seed_ref) {} typedef char char_type; typedef io::sink_tag category; std::streamsize write(const char* s, std::streamsize n) { boost::hash_combine(*_ptr, boost::hash_range(s, s+n)); return n; } private: size_t* _ptr; };

完整演示:
生活在 Coliru

#include <iostream> #include <iomanip> #include <boost/archive/binary_oarchive.hpp> #include <boost/multiprecision/cpp_int.hpp> #include <boost/multiprecision/cpp_int/serialize.hpp> #include <boost/iostreams/device/back_inserter.hpp> #include <boost/iostreams/stream_buffer.hpp> #include <boost/iostreams/stream.hpp> #include <boost/functional/hash.hpp> namespace mp_hashing { namespace io = boost::iostreams; struct hash_sink { hash_sink(size_t& seed_ref) : _ptr(&seed_ref) {} typedef char char_type; typedef io::sink_tag category; std::streamsize write(const char* s, std::streamsize n) { boost::hash_combine(*_ptr, boost::hash_range(s, s+n)); return n; } private: size_t* _ptr; }; template <typename T> struct hash_impl { size_t operator()(T const& v) const { using namespace boost; size_t seed = 0; { iostreams::stream<hash_sink> os(seed); archive::binary_oarchive oa(os, archive::no_header | archive::no_codecvt); oa << v; } return seed; } }; } #include <unordered_map> #include <boost/unordered_map.hpp> namespace std { template <typename backend> struct hash<boost::multiprecision::number<backend> > : mp_hashing::hash_impl<boost::multiprecision::number<backend> > {}; } namespace boost { template <typename backend> struct hash<multiprecision::number<backend> > : mp_hashing::hash_impl<multiprecision::number<backend> > {}; } template <typename Map> void test(Map const& map) { std::cout << " " << __PRETTY_FUNCTION__ << " "; for(auto& p : map) std::cout << p.second << " " << p.first << " "; } int main() { using boost::multiprecision::cpp_int; test(std::unordered_map<cpp_int, std::string> { { cpp_int(1) << 111, "one" }, { cpp_int(2) << 222, "two" }, { cpp_int(3) << 333, "three" }, }); test(boost::unordered_map<cpp_int, std::string> { { cpp_int(1) << 111, "one" }, { cpp_int(2) << 222, "two" }, { cpp_int(3) << 333, "three" }, }); }

印刷品

void test(const Map&) [with Map = std::unordered_map<boost::multiprecision::number<boost::multiprecision::backends::cpp_int_backend<> >, std::basic_string<char> >] one 2596148429267413814265248164610048 three 52494017394792286184940053450822912768476066341437098474218494553838871980785022157364316248553291776 two 13479973333575319897333507543509815336818572211270286240551805124608 void test(const Map&) [with Map = boost::unordered::unordered_map<boost::multiprecision::number<boost::multiprecision::backends::cpp_int_backend<> >, std::basic_string<char> >] three 52494017394792286184940053450822912768476066341437098474218494553838871980785022157364316248553291776 two 13479973333575319897333507543509815336818572211270286240551805124608 one 2596148429267413814265248164610048

如您所见，Boost 和标准库的 unordered_map 在实现上的差异表现在相同散列的不同排序中.

As you can see, the difference in implementation between Boost's and the standard library's unordered_map show up in the different orderings for identical hashes.

相关文章