Comprehensive vector vs. linked-list benchmark for randomized insertions/deletions
So I am aware of this question, and others on SO that deal with this issue, but most of those deal with the complexities of the data structures (just to copy here, a linked list theoretically has O(
I understand the complexities would seem to indicate that a list would be better, but I am more concerned with the real world performance.
Note: This question was inspired by slides 45 and 46 of Bjarne Stroustrup's presentation at Going Native 2012 where he talks about how processor caching and locality of reference really help with vectors, but not at all (or enough) with lists.
Question: Is there a good way to test this using CPU time as opposed to wall time, and is there a decent way of "randomly" inserting and deleting elements that can be decided beforehand so it does not influence the timings?
As a bonus, it would be nice to be able to apply this to two arbitrary data structures (say vector and hash maps or something like that) to find the "real world performance" on some hardware.
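For example, the kind of thing I mean by "beforehand" is roughly the following sketch (illustrative only; make_values is a made-up name, not existing code): generate the sequence of random values once, outside any timed region, so the cost of rand() never shows up in the timings.

#include <cstdlib>
#include <vector>

// Build the sequence of "random" values up front so RNG work stays outside
// the timed region (illustrative sketch only).
std::vector<int> make_values(std::size_t n, unsigned seed) {
    std::srand(seed);
    std::vector<int> values(n);
    for (std::size_t i = 0; i < n; ++i)
        values[i] = std::rand();
    return values;
}

// Usage: std::vector<int> ops = make_values(30000, 1234); then iterate over
// ops inside the timed insertion/deletion loops.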
Answer
I guess if I were going to test something like this, I'd probably start with code something on this order:
#include <list>
#include <vector>
#include <deque>
#include <algorithm>
#include <stdlib.h>
#include <time.h>
#include <iostream>
#include <iterator>

static const int size = 30000;

// Insert `size` pseudo-random values, each at its sorted position.
template <class T>
double insert(T &container) {
    srand(1234);
    clock_t start = clock();
    for (int i = 0; i < size; ++i) {
        int value = rand();
        typename T::iterator pos = std::lower_bound(container.begin(), container.end(), value);
        container.insert(pos, value);
    }
    // uncomment the following to verify correct insertion (in a small container).
    // std::copy(container.begin(), container.end(), std::ostream_iterator<int>(std::cout, " "));
    return double(clock() - start) / CLOCKS_PER_SEC;
}

// Delete half of the values again, locating each one by binary search.
template <class T>
double del(T &container) {
    srand(1234);
    clock_t start = clock();
    for (int i = 0; i < size / 2; ++i) {
        int value = rand();
        typename T::iterator pos = std::lower_bound(container.begin(), container.end(), value);
        container.erase(pos);
    }
    return double(clock() - start) / CLOCKS_PER_SEC;
}

int main() {
    std::list<int> l;
    std::vector<int> v;
    std::deque<int> d;

    std::cout << "Insertion time for list: " << insert(l) << "\n";
    std::cout << "Insertion time for vector: " << insert(v) << "\n";
    std::cout << "Insertion time for deque: " << insert(d) << "\n";

    std::cout << "Deletion time for list: " << del(l) << '\n';
    std::cout << "Deletion time for vector: " << del(v) << '\n';
    std::cout << "Deletion time for deque: " << del(d) << '\n';

    return 0;
}
Since it uses clock(), this should give processor time, not wall time (though some compilers, such as MS VC++, get that wrong). It doesn't try to measure the insertion time exclusive of the time to find the insertion point, since 1) that would take a bit more work and 2) I still can't figure out what it would accomplish. It's certainly not 100% rigorous, but given the disparity I see from it, I'd be a bit surprised to see a significant difference from more careful testing. For example, with MS VC++, I get:
Insertion time for list: 6.598
Insertion time for vector: 1.377
Insertion time for deque: 1.484
Deletion time for list: 6.348
Deletion time for vector: 0.114
Deletion time for deque: 0.82
With gcc I get:
Insertion time for list: 5.272
Insertion time for vector: 0.125
Insertion time for deque: 0.125
Deletion time for list: 4.259
Deletion time for vector: 0.109
Deletion time for deque: 0.109
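As an aside on the clock() caveat above: if a particular compiler's clock() really does report wall time instead of CPU time, one possible substitute (a sketch only, assuming a POSIX system where CLOCK_PROCESS_CPUTIME_ID is available) is to read per-process CPU time directly:

#include <time.h>

// Sketch: read per-process CPU time via clock_gettime (POSIX; availability of
// CLOCK_PROCESS_CPUTIME_ID is an assumption, not part of the original answer).
double cpu_seconds() {
    timespec ts;
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

// Usage: call once before and once after a loop; subtract to get CPU seconds.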
Factoring out the search time would be somewhat non-trivial, because you'd have to time each iteration separately. You'd need something more precise than clock() (usually is) to produce meaningful results from that (more on the order of reading a clock cycle register). Feel free to modify for that if you see fit -- as I mentioned above, I lack motivation because I can't see how it's a sensible thing to do.
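That said, if someone did want to try it, the shape of it might look roughly like this (a sketch only, using std::chrono::steady_clock rather than reading a cycle register; insert_only is an illustrative name, and whether steady_clock is fine-grained enough for a single insert is exactly the open question):

#include <algorithm>
#include <chrono>
#include <cstdlib>

// Sketch: time only the insert itself, excluding the lower_bound search,
// accumulating per-iteration durations. Averaging over many iterations helps
// compensate for the limited resolution of any single measurement.
template <class T>
double insert_only(T &container, int count) {
    std::srand(1234);
    std::chrono::steady_clock::duration total{};
    for (int i = 0; i < count; ++i) {
        int value = std::rand();
        typename T::iterator pos =
            std::lower_bound(container.begin(), container.end(), value);
        auto start = std::chrono::steady_clock::now();  // start after the search
        container.insert(pos, value);
        total += std::chrono::steady_clock::now() - start;
    }
    return std::chrono::duration<double>(total).count();  // total seconds
}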