STL vectors with uninitialized storage?
I'm writing an inner loop that needs to place structs in contiguous storage. I don't know how many of these structs there will be ahead of time. My problem is that STL's vector initializes its values to 0, so no matter what I do, I incur the cost of the initialization plus the cost of setting the struct's members to their values.
Is there any way to prevent the initialization, or is there an STL-like container out there with resizable contiguous storage and uninitialized elements?
(I'm certain that this part of the code needs to be optimized, and I'm certain that the initialization is a significant cost.)
Also, see my comments below for a clarification about when the initialization occurs.
Some code:
void GetsCalledALot(int* data1, int* data2, int count) {
    int mvSize = memberVector.size();
    memberVector.resize(mvSize + count); // causes 0-initialization
    for (int i = 0; i < count; ++i) {
        memberVector[mvSize + i].d1 = data1[i];
        memberVector[mvSize + i].d2 = data2[i];
    }
}
Recommended answer
std::vector must initialize the values in the array somehow, which means some constructor (or copy-constructor) must be called. The behavior of vector (or any container class) is undefined if you access the uninitialized section of the array as if it were initialized.
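To make the initialization described above visible, here is a minimal sketch. The struct name Pair and the function name ResizeZeroesNewElements are made up for illustration; the only library behavior relied on is that resize() value-initializes the elements it adds.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical plain struct standing in for the question's element type.
struct Pair {
    int d1;
    int d2;
};

// Returns true if every element added by resize() has both members equal to 0.
bool ResizeZeroesNewElements(std::size_t count) {
    std::vector<Pair> v;
    v.resize(count);  // value-initializes (zeroes) the new elements
    assert(v.size() == count);
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (v[i].d1 != 0 || v[i].d2 != 0) return false;
    }
    return true;
}
```

Calling ResizeZeroesNewElements(16) should return true, which is the zeroing cost the question is trying to avoid.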
The best way is to use reserve() and push_back(), so that the copy-constructor is used, avoiding default-construction.
Using your example code:
struct YourData {
    int d1;
    int d2;
    YourData(int v1, int v2) : d1(v1), d2(v2) {}
};

std::vector<YourData> memberVector;

void GetsCalledALot(int* data1, int* data2, int count) {
    int mvSize = memberVector.size();

    // Does not initialize the extra elements
    memberVector.reserve(mvSize + count);

    // Note: consider using std::generate_n or std::copy instead of this loop.
    for (int i = 0; i < count; ++i) {
        // Copy construct using a temporary.
        memberVector.push_back(YourData(data1[i], data2[i]));
    }
}
The only problem with calling reserve() (or resize()) like this is that you may end up invoking the copy-constructor more often than you need to, because every time the vector has to grow, the existing elements are copied into the new storage. If you can make a good prediction as to the final size of the array, it's better to reserve() the space once at the beginning. If you don't know the final size though, at least the number of copies will be minimal on average.
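The effect of reserving once up front can be measured with a small sketch like the following. Counted and CopiesToFill are hypothetical names introduced for this illustration; the type only exists to count copy-constructor calls.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical instrumented type: counts how often it is copy-constructed.
// Declaring the copy constructor suppresses the implicit move constructor,
// so both push_back and reallocation go through the copy constructor.
struct Counted {
    static std::size_t copies;
    int value;
    explicit Counted(int v) : value(v) {}
    Counted(const Counted& other) : value(other.value) { ++copies; }
    Counted& operator=(const Counted&) = default;
};
std::size_t Counted::copies = 0;

// Returns the number of copy constructions needed to push n elements,
// with or without a single reserve() call before the loop.
std::size_t CopiesToFill(std::size_t n, bool reserveFirst) {
    Counted::copies = 0;
    std::vector<Counted> v;
    if (reserveFirst) v.reserve(n);  // one allocation, no later regrowth
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(Counted(static_cast<int>(i)));
    }
    assert(v.size() == n);
    return Counted::copies;
}
```

With the reserve, each element should be copied exactly once (from its temporary). Without it, the count also includes the copies made on each reallocation; how many that is depends on the library's growth policy.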
In the current version of C++, the inner loop is a bit inefficient, as a temporary value is constructed on the stack, copy-constructed into the vector's memory, and finally the temporary is destroyed. However, the next version of C++ has a feature called rvalue references (T&&) which will help.
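The "next version" mentioned above became C++11, which added rvalue references and, building on them, std::vector::emplace_back. The sketch below assumes a C++11 compiler; AppendPairs is a hypothetical name, and the vector is passed as a parameter rather than being the global memberVector from the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct YourData {
    int d1;
    int d2;
    YourData(int v1, int v2) : d1(v1), d2(v2) {}
};

// Appends count elements built from data1[i] and data2[i] to out.
void AppendPairs(std::vector<YourData>& out, const int* data1,
                 const int* data2, std::size_t count) {
    assert(count == 0 || (data1 != nullptr && data2 != nullptr));
    for (std::size_t i = 0; i < count; ++i) {
        // emplace_back forwards the arguments to YourData's constructor and
        // builds the element directly in the vector's storage, so no
        // temporary YourData is created and then copied.
        out.emplace_back(data1[i], data2[i]);
    }
}
```

The caller can still reserve() ahead of time if the final size is predictable; the sketch leaves growth to the vector.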
The interface supplied by std::vector does not allow for another option, which is to use some factory-like class to construct values other than the default. Here is a rough example of what this pattern would look like implemented in C++:
template <typename T>
class my_vector_replacement {
public:
    // ...

    template <typename F>
    void push_back_using_factory(F factory) {
        // ... check size of array, and resize if needed.

        // Copy construct using placement new,
        new (arrayData + end) T(factory());
        end += sizeof(T);
    }

private:
    char* arrayData;
    size_t end; // Size in bytes of the initialized data in arrayData
};

// One of many possible implementations
struct MyFactory {
    MyFactory(int* p1, int* p2) : d1(p1), d2(p2) {}
    YourData operator()() const {
        return YourData(*d1, *d2);
    }
    int* d1;
    int* d2;
};

void GetsCalledALot(int* data1, int* data2, int count) {
    // ... Still will need the same call to a reserve() type function.

    // Note: consider using std::generate_n or std::copy instead of this loop.
    for (int i = 0; i < count; ++i) {
        // Copy construct using a factory
        memberVector.push_back_using_factory(MyFactory(data1 + i, data2 + i));
    }
}
Doing this does mean you have to create your own vector class. In this case it also complicates what should have been a simple example. But there may be times when using a factory function like this is better, for instance if the insert is conditional on some other value, and you would otherwise have to unconditionally construct some expensive temporary even if it wasn't actually needed.
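For readers who want something they can compile, here is one way the rough pattern above might be filled in. It is only a sketch under several assumptions: C++11, no exception safety, no support for over-aligned types, and a doubling growth policy starting at 8 elements chosen arbitrarily. The reserve(), size() and operator[] members are additions for the sake of a complete example, and the storage is kept as a T* with an element count rather than the char* and byte offset used above.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>

struct YourData {
    int d1;
    int d2;
    YourData(int v1, int v2) : d1(v1), d2(v2) {}
};

template <typename T>
class my_vector_replacement {
public:
    my_vector_replacement() : data_(nullptr), size_(0), capacity_(0) {}
    my_vector_replacement(const my_vector_replacement&) = delete;
    my_vector_replacement& operator=(const my_vector_replacement&) = delete;
    ~my_vector_replacement() {
        destroy_all(data_, size_);
        ::operator delete(data_);
    }

    // Grows the raw storage; the new slots are left unconstructed.
    void reserve(std::size_t n) {
        if (n <= capacity_) return;
        T* fresh = static_cast<T*>(::operator new(n * sizeof(T)));
        for (std::size_t i = 0; i < size_; ++i) {
            new (fresh + i) T(std::move(data_[i]));
        }
        destroy_all(data_, size_);
        ::operator delete(data_);
        data_ = fresh;
        capacity_ = n;
    }

    // Constructs the new element in place from whatever factory() returns.
    template <typename F>
    void push_back_using_factory(F factory) {
        if (size_ == capacity_) reserve(capacity_ == 0 ? 8 : capacity_ * 2);
        new (data_ + size_) T(factory());
        ++size_;
    }

    std::size_t size() const { return size_; }
    const T& operator[](std::size_t i) const {
        assert(i < size_);
        return data_[i];
    }

private:
    static void destroy_all(T* p, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) p[i].~T();
    }
    T* data_;
    std::size_t size_;      // number of constructed elements
    std::size_t capacity_;  // number of elements the storage can hold
};

// Same factory as above, taking const pointers.
struct MyFactory {
    MyFactory(const int* p1, const int* p2) : d1(p1), d2(p2) {}
    YourData operator()() const { return YourData(*d1, *d2); }
    const int* d1;
    const int* d2;
};
```

Pushing MyFactory(data1 + i, data2 + i) in a loop, as in GetsCalledALot above, then fills the container without any default construction or zeroing of the new slots.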