C++ vector: what happens whenever it expands/reallocates on the stack?

2021-12-21 00:00:00 stack vector c++ allocator

I'm new to C++ and I'm using the vector class in my project. I find it quite useful because it gives me an array that reallocates automatically whenever necessary (i.e., if I push_back an item and the vector has reached its maximum capacity, it reallocates itself, asking the OS for more memory space), and access to an element of the vector is very quick (unlike a list, where reaching the n-th element means going through the first n elements).

I found this question very useful, because its answers explained perfectly how the "memory allocator" works when I want to store my vector on the heap/stack:

[1] vector<Type> vect;
[2] vector<Type> *vect = new vector<Type>;
[3] vector<Type*> vect;

However, one doubt has been bugging me for a while, and I can't find the answer: whenever I construct a vector and begin pushing a lot of items into it, there comes a moment when the vector is full, so to continue growing it needs to reallocate, copy itself to a new location, and then continue to push_back items (obviously, this reallocation is hidden in the implementation of the class, so it is completely transparent to me).

Fine, if I have created the vector on the heap [2], I have no trouble imagining what may be happening: class vector calls malloc, acquires new space, copies itself into the new memory, and finally releases the old memory by calling free.

However, a veil hides what is happening when I construct a vector on the stack [1]: what happens when the vector must reallocate? AFAIK, whenever you enter a new function in C/C++, the computer looks at the variable declarations and expands the stack to get the space needed for those variables, but you can't allocate more space on the stack while the function is already running. How does the class vector solve this problem?

Recommended answer

You wrote

[...] copy itself to a new location [...]

which is not the way a vector works. The vector data is copied to a new location, not the vector itself.

My answer should give you an idea of how a vector is designed.

Note: std::allocator is likely to be an empty class, so std::vector will probably not store an instance of it. This may not be true for an arbitrary allocator.

In most implementations a vector consists of three pointers, where

  • begin points to the start of the vector's data memory on the heap (always on the heap if not nullptr)
  • end points one memory location past the last element of the vector data -> size() == end-begin
  • capacity points one memory location past the end of the vector's allocated memory -> capacity() == capacity-begin
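
As a rough illustration of the list above, here is a minimal sketch of such a three-pointer layout. VectorLayout, reserved() and describe() are hypothetical names made up for this example; real standard library implementations differ in naming and detail.

```cpp
#include <cstddef>

// Hypothetical stand-in for the three-pointer layout; the member names mirror
// the list above and are not taken from any real standard library.
template <typename T>
struct VectorLayout {
    T* begin;     // start of the element buffer (nullptr while empty)
    T* end;       // one past the last element in use
    T* capacity;  // one past the end of the allocated buffer

    VectorLayout() : begin(nullptr), end(nullptr), capacity(nullptr) {}

    // Number of elements in use: end - begin.
    std::size_t size() const { return static_cast<std::size_t>(end - begin); }
    // Number of elements the buffer can hold: capacity - begin.
    std::size_t reserved() const { return static_cast<std::size_t>(capacity - begin); }
};

// Describes a caller-owned buffer in which `used` of `total` slots are in use.
template <typename T>
VectorLayout<T> describe(T* buffer, std::size_t used, std::size_t total) {
    VectorLayout<T> v;
    v.begin = buffer;
    v.end = buffer + used;
    v.capacity = buffer + total;
    return v;
}
```

For a buffer of four ints with two in use, describe(buf, 2, 4) gives size() == 2 and reserved() == 4, matching the two formulas in the list.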

We declare a variable of type std::vector<T,A>, where T is any type and A is an allocator type for T (e.g. std::allocator<T>).

std::vector<T, A> vect1;

What does this look like in memory?

As we see: nothing happens on the heap, but the variable occupies the memory on the stack that is necessary for all of its members. There it is, and there it will stay until vect1 goes out of scope, since vect1 is just an object like any other object of type double, int or whatever. It will sit at its stack position and wait to be destroyed, regardless of how much memory it manages on the heap.

The pointers of vect1 do not point anywhere, since the vector is empty.

Now we need a pointer to a vector, and we use dynamic heap allocation to create the vector.

std::vector<T, A> * vp = new std::vector<T, A>;

Let's look at the memory again.

We have our vp variable on the stack and our vector is on the heap now. Again, the vector itself will not move on the heap, since its size is constant. Only the pointers (begin, end, capacity) will move to follow the data position in memory if a reallocation takes place. Let's have a look at that.
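
To make the "its size is constant" point concrete, the sketch below checks that the elements of a heap-allocated vector live outside the bytes of the vector object itself. The helper names data_is_outside_object and heap_vector_check are made up for this example, and the element count of 1000 is arbitrary.

```cpp
#include <cstdint>
#include <vector>

// Returns true if the element buffer lies outside the sizeof(v) bytes
// occupied by the vector object itself.
inline bool data_is_outside_object(const std::vector<int>& v) {
    const std::uintptr_t obj_begin = reinterpret_cast<std::uintptr_t>(&v);
    const std::uintptr_t obj_end = obj_begin + sizeof(v);
    const std::uintptr_t data = reinterpret_cast<std::uintptr_t>(v.data());
    return data < obj_begin || data >= obj_end;
}

// Creates the vector object on the heap, as in case [2] of the question,
// fills it, and checks where the elements ended up.
inline bool heap_vector_check() {
    std::vector<int>* vp = new std::vector<int>;
    for (int i = 0; i < 1000; ++i) {
        vp->push_back(i);  // may reallocate the element buffer, never *vp itself
    }
    const bool ok = data_is_outside_object(*vp) && vp->size() == 1000;
    delete vp;
    return ok;
}
```

sizeof(*vp) is fixed at compile time, so the 1000 ints cannot be stored inside the object; they sit in a separate block that the vector's pointers refer to.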

Now we can start pushing elements into a vector. Let's look at vect1.

T a;
vect1.push_back(a);

The variable vect1 is still where it was, but memory on the heap has been allocated to hold one element of type T.

What happens if we add one more element?

vect1.push_back(a);

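  • The space allocated on the heap for the data elements will not be enough (since it is only one memory location).
  • A new memory block will be allocated for two elements.
  • The first element will be copied/moved to the new storage.
  • The old memory will be deallocated.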
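
The steps above can be sketched with a hand-rolled buffer of ints. IntBuffer, push_back and release are hypothetical names for this illustration only, and doubling the capacity is an assumption; the growth factor of a real std::vector is implementation-defined.

```cpp
#include <cstddef>

// Hypothetical three-pointer buffer of ints; not a real library type.
struct IntBuffer {
    int* begin;
    int* end;
    int* capacity;
    IntBuffer() : begin(nullptr), end(nullptr), capacity(nullptr) {}
};

// Appends a value, reallocating first when the buffer is full.
inline void push_back(IntBuffer& b, int value) {
    if (b.end == b.capacity) {                        // not enough space left
        const std::size_t old_size = static_cast<std::size_t>(b.end - b.begin);
        const std::size_t new_cap = old_size == 0 ? 1 : old_size * 2;
        int* fresh = new int[new_cap];                // allocate a new, larger block
        for (std::size_t i = 0; i < old_size; ++i) {
            fresh[i] = b.begin[i];                    // copy existing elements over
        }
        delete[] b.begin;                             // deallocate the old block
        b.begin = fresh;                              // only the three pointers change
        b.end = fresh + old_size;
        b.capacity = fresh + new_cap;
    }
    *b.end = value;
    ++b.end;
}

// Frees the element buffer and resets the pointers.
inline void release(IntBuffer& b) {
    delete[] b.begin;
    b = IntBuffer();
}
```

Note that the IntBuffer object itself never moves or changes size, wherever it was created; only the block its pointers refer to is replaced.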

We see: the new memory location is different.
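
This can be observed on a real std::vector. The sketch below pushes elements until the capacity changes and then compares addresses; ReallocationReport and push_until_reallocation are names invented for this example. Addresses are stored as integers before the reallocation so that no freed pointer is inspected afterwards.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct ReallocationReport {
    bool object_stayed_put;  // address of the vector object is unchanged
    bool data_moved;         // address of the element buffer changed
};

inline ReallocationReport push_until_reallocation() {
    std::vector<int> vect1;  // the object itself is a local on this function's stack
    const std::uintptr_t object_before = reinterpret_cast<std::uintptr_t>(&vect1);

    vect1.push_back(1);
    const std::uintptr_t data_before = reinterpret_cast<std::uintptr_t>(vect1.data());
    const std::size_t capacity_before = vect1.capacity();

    // Keep pushing until the capacity changes, i.e. a reallocation happened.
    while (vect1.capacity() == capacity_before) {
        vect1.push_back(0);
    }

    ReallocationReport report;
    report.object_stayed_put = reinterpret_cast<std::uintptr_t>(&vect1) == object_before;
    report.data_moved = reinterpret_cast<std::uintptr_t>(vect1.data()) != data_before;
    return report;
}
```

Both fields should come out true with the default allocator: the new block is allocated while the old one is still alive, so the two cannot share an address, and a local variable never changes its own address.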

For additional insight, let's look at what happens when we destroy the last element.

vect1.pop_back();

The allocated memory won't change, but the last element will have its destructor called and the end pointer moves one position down.

As you can see: capacity() == capacity-begin == 2 and size() == end-begin == 1
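
A small sketch to check this with std::vector follows; PopBackReport and push_two_pop_one are hypothetical names for this example. The exact capacity after two push_back calls is implementation-defined, so the sketch only compares the capacity before and after pop_back instead of assuming it equals 2.

```cpp
#include <cstddef>
#include <vector>

struct PopBackReport {
    std::size_t size_after;
    std::size_t capacity_before;
    std::size_t capacity_after;
};

inline PopBackReport push_two_pop_one() {
    std::vector<int> vect1;
    vect1.push_back(1);
    vect1.push_back(2);

    PopBackReport report;
    report.capacity_before = vect1.capacity();
    vect1.pop_back();  // destroys the last element; the buffer is kept
    report.size_after = vect1.size();
    report.capacity_after = vect1.capacity();
    return report;
}
```

To actually give memory back, one would have to ask for it explicitly, for example with shrink_to_fit(), which is a non-binding request.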
