What is going on with the memory overhead of std::deque?
I am working on an external sorting algorithm that uses std::queue and must carefully constrain its memory usage. I have noticed that during the merge phase (which uses several std::queues of fixed length), my memory usage increases to about 2.5X what I expected. Since std::queue by default uses std::deque as its underlying container, I ran some tests on std::deque to determine its memory overhead. Here are the results, running on VC++ 9, in release mode, with a 64-bit process:
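For concreteness, here is a minimal sketch of the kind of test I ran (this version reads the working set via the Win32 GetProcessMemoryInfo and links against psapi.lib; the real harness may differ in detail):

#include <deque>
#include <iostream>
#include <windows.h>
#include <psapi.h>   // GetProcessMemoryInfo; link with psapi.lib

static SIZE_T working_set_kb()
{
    PROCESS_MEMORY_COUNTERS pmc = { sizeof(pmc) };
    GetProcessMemoryInfo(GetCurrentProcess(), &pmc, sizeof(pmc));
    return pmc.WorkingSetSize / 1024;
}

int main()
{
    SIZE_T before = working_set_kb();
    std::deque<char> d;                     // swap in double for the second test
    for (int i = 0; i < 100000000; ++i)
        d.push_back('x');
    std::cout << (working_set_kb() - before) << "K used\n";
}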
When adding 100,000,000 chars to a std::deque, the memory usage grows to 252,216K. Note that 100M chars (1 byte) should occupy 97,656K, so this is an overhead of 154,560K.
I repeated the test with doubles (8 bytes) and saw memory grow to 1,976,676K, while 100M doubles should occupy 781,250K, for an overhead of 1,195,426K!!
Now I understand that std::deque is normally implemented as a linked list of "chunks." If this is true, then why is the overhead proportional to the element size (because of course the pointer size should be fixed at 8 bytes)? And why is it so danged huge?
Can anybody shed some light on why std::deque uses so much danged memory? I'm thinking I should switch my std::queue's underlying container to std::vector, as there is no overhead (given that the appropriate size is reserved). I'm thinking the benefits of std::deque are largely negated by the fact that it has such a huge overhead (resulting in cache misses, page faults, etc.), and that the cost of copying std::vector elements may be less, given that the overall memory usage is so much lower. Is this just a bad implementation of std::deque by Microsoft?
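For what it's worth, the underlying container is just a template parameter on std::queue. One catch with std::vector specifically: std::queue::pop() calls pop_front() on the underlying container, which std::vector does not provide, so a drop-in swap won't compile once pop() is used. The sketch below uses std::list purely to illustrate the mechanism:

#include <list>
#include <queue>

int main()
{
    // The second template parameter selects the underlying container.
    // std::list meets std::queue's requirements (front, back,
    // push_back, pop_front); std::vector does not (no pop_front).
    std::queue<double, std::list<double> > q;
    q.push(3.14);
    q.pop();   // forwards to list::pop_front()
}

(std::list is shown only to demonstrate the parameter; its per-node overhead of two pointers plus a separate allocation per element would be even worse for this purpose.)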
Accepted answer
Look at the code for _DEQUESIZ (number of elements per block):
#define _DEQUESIZ   (sizeof (_Ty) <= 1 ? 16 \
    : sizeof (_Ty) <= 2 ? 8 \
    : sizeof (_Ty) <= 4 ? 4 \
    : sizeof (_Ty) <= 8 ? 2 : 1)    /* elements per block (a power of 2) */
It gets smaller as the element gets larger. Only for elements larger than 8 bytes will you get the expected behavior (a percentage decrease in overhead as the element size increases).
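To see how this adds up for the measurements above, here is a back-of-the-envelope sketch. The 16-byte heap header assumed per block and the 8-byte map pointer are illustrative guesses at the 64-bit CRT's per-allocation bookkeeping, not measured values:

#include <cstddef>
#include <iostream>

template <typename T>
void report(const char* name)
{
    // Mirrors _DEQUESIZ: every block is only 16 bytes of payload
    // for element sizes up to 8 bytes.
    std::size_t per_block = sizeof(T) <= 1 ? 16
                          : sizeof(T) <= 2 ? 8
                          : sizeof(T) <= 4 ? 4
                          : sizeof(T) <= 8 ? 2 : 1;
    std::size_t n = 100000000;                         // 100M elements
    std::size_t blocks = (n + per_block - 1) / per_block;
    std::size_t payload_kb = n * sizeof(T) / 1024;
    // Assumed per-block cost: ~16-byte heap header + 8-byte map pointer.
    std::size_t overhead_kb = blocks * (16 + 8) / 1024;
    std::cout << name << ": " << blocks << " blocks, payload "
              << payload_kb << "K, overhead roughly " << overhead_kb << "K\n";
}

int main()
{
    report<char>("char");      // ~6.25M blocks -> overhead ~146,484K
    report<double>("double");  // 50M blocks    -> overhead ~1,171,875K
}

For double that predicts roughly 1,171,875K of overhead on 781,250K of payload, in the neighborhood of the measured 1,195,426K, and consistent with the ~2.5X growth observed: only 16 of every ~40 bytes allocated per block are payload.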