Binary serialization of std::bitset
std::bitset has a to_string() method that serializes the bitset as a char-based string of 1s and 0s. Obviously, this uses a whole 8-bit char for each bit in the bitset, making the serialized representation 8 times longer than necessary.
I want to store the bitset in a binary representation to save space. The to_ulong() method only helps when the whole bitset fits in an unsigned long (32 bits on some platforms). I have hundreds of bits.
I'm not sure I want to use memcpy()/std::copy() on the object (address) itself, as that assumes the object is a POD.
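To make the size argument concrete, here is a minimal sketch comparing the length of to_string() with the packed size, and showing how to_ulong() behaves on a large bitset. The helper names packed_size and fits_in_ulong are hypothetical and only serve the illustration; the sketch relies on to_ulong() throwing std::overflow_error when the value does not fit in an unsigned long.

```cpp
#include <bitset>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: bytes needed to hold `bits` bits packed 8 per byte.
constexpr std::size_t packed_size(std::size_t bits)
{
    return (bits + 7) / 8;
}

// Hypothetical helper: true if to_ulong() succeeds for this bitset,
// false if it throws std::overflow_error because a set bit lies
// beyond the width of unsigned long.
template <std::size_t N>
bool fits_in_ulong(const std::bitset<N>& bs)
{
    try {
        (void)bs.to_ulong();
        return true;
    } catch (const std::overflow_error&) {
        return false;
    }
}
```

For a 300-bit bitset, to_string() produces 300 characters while the packed form needs only 38 bytes, and to_ulong() stops being usable as soon as a high bit is set.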
The API does not seem to provide a handle to the internal array representation whose address I could have taken.
I would also like the option to deserialize the bitset from the binary representation.
How can I do this?
Recommended answer
This is a possible approach based on explicitly creating a std::vector<unsigned char> by reading/writing one bit at a time:
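#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

// Pack the bits into bytes, least significant bit first within each byte.
template<std::size_t N>
std::vector<unsigned char> bitset_to_bytes(const std::bitset<N>& bs)
{
    std::vector<unsigned char> result((N + 7) >> 3);
    for (std::size_t j = 0; j < N; j++)
        result[j >> 3] |= static_cast<unsigned char>(bs[j] << (j & 7));
    return result;
}

// Rebuild the bitset from bytes produced by bitset_to_bytes.
template<std::size_t N>
std::bitset<N> bitset_from_bytes(const std::vector<unsigned char>& buf)
{
    assert(buf.size() == ((N + 7) >> 3));
    std::bitset<N> result;
    for (std::size_t j = 0; j < N; j++)
        result[j] = ((buf[j >> 3] >> (j & 7)) & 1);
    return result;
}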
Note that when calling the deserialization function template bitset_from_bytes, the bitset size N must be specified explicitly in the call, because it cannot be deduced from the vector argument. For example:
std::bitset<N> bs1;
...
std::vector<unsigned char> buffer = bitset_to_bytes(bs1);
...
std::bitset<N> bs2 = bitset_from_bytes<N>(buffer);
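If the packed bytes are meant to go to a file or socket, the same bit layout can be written straight to a stream. The following is a minimal sketch, not part of the original answer: write_bitset and read_bitset are hypothetical names, and the choice to throw std::runtime_error on a short read is an assumption made for the example.

```cpp
#include <bitset>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <vector>

// Hypothetical helper: pack the bits LSB-first into bytes and write them to `out`.
template <std::size_t N>
void write_bitset(std::ostream& out, const std::bitset<N>& bs)
{
    std::vector<unsigned char> buf((N + 7) / 8);
    for (std::size_t j = 0; j < N; ++j)
        if (bs[j])
            buf[j / 8] = static_cast<unsigned char>(buf[j / 8] | (1u << (j % 8)));
    out.write(reinterpret_cast<const char*>(buf.data()),
              static_cast<std::streamsize>(buf.size()));
}

// Hypothetical helper: read (N + 7) / 8 bytes from `in` and unpack them.
// Throws std::runtime_error if the stream does not supply enough bytes.
template <std::size_t N>
std::bitset<N> read_bitset(std::istream& in)
{
    std::vector<unsigned char> buf((N + 7) / 8);
    in.read(reinterpret_cast<char*>(buf.data()),
            static_cast<std::streamsize>(buf.size()));
    if (!in)
        throw std::runtime_error("short read while deserializing bitset");
    std::bitset<N> bs;
    for (std::size_t j = 0; j < N; ++j)
        bs[j] = ((buf[j / 8] >> (j % 8)) & 1u) != 0;
    return bs;
}
```

As with bitset_from_bytes, the size N has to be given explicitly when reading, for instance read_bitset<300>(stream). For real files the stream should be opened in binary mode.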
If you really care about speed, one option that would gain something is to unroll the loop so that the packing is done, for example, one byte at a time. Even better is to write your own bitset implementation that doesn't hide the internal binary representation, instead of using std::bitset.
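The byte-at-a-time idea could look roughly like the sketch below. The function name bitset_to_bytes_bytewise is hypothetical, and whether this is actually faster than the bit-by-bit loop depends on the compiler and standard library, so it should be measured rather than assumed.

```cpp
#include <bitset>
#include <cstddef>
#include <vector>

// Hypothetical variant of bitset_to_bytes: each full byte is assembled from
// eight bits in one expression (a manually unrolled inner loop), and the
// remaining N % 8 bits are handled one at a time.
template <std::size_t N>
std::vector<unsigned char> bitset_to_bytes_bytewise(const std::bitset<N>& bs)
{
    std::vector<unsigned char> result((N + 7) / 8);
    const std::size_t full_bytes = N / 8;
    for (std::size_t i = 0; i < full_bytes; ++i) {
        const std::size_t b = i * 8;  // index of the first bit of this byte
        result[i] = static_cast<unsigned char>(
            (bs[b]     << 0) | (bs[b + 1] << 1) |
            (bs[b + 2] << 2) | (bs[b + 3] << 3) |
            (bs[b + 4] << 4) | (bs[b + 5] << 5) |
            (bs[b + 6] << 6) | (bs[b + 7] << 7));
    }
    // Tail: bits that do not fill a whole byte.
    for (std::size_t j = full_bytes * 8; j < N; ++j)
        result[j / 8] |= static_cast<unsigned char>(bs[j] << (j % 8));
    return result;
}
```

The output layout matches the bit-by-bit version (least significant bit first within each byte), so bitset_from_bytes can read it back unchanged.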