What is data alignment? Why and when should I be worried when typecasting pointers in C?

2021-12-21 00:00:00 memory c c++

I couldn't find a decent document that explains how the alignment system works and why some types are more strictly aligned than others.

Recommended answer

I'll try to explain it briefly.

Your computer's architecture is composed of a processor and memory. Memory is organized in cells, like this:

 0x00 |   data  |  
 0x01 |   ...   |
 0x02 |   ...   |

Each memory cell has a fixed size: the number of bits it can store. This is architecture dependent.

When you define a variable in your C/C++ program, that variable occupies one or more memory cells.

For example:

int variable = 12;

Suppose each cell holds 32 bits and the int type is 32 bits wide. Then somewhere in your memory:

variable: | 0 0 0 c |  // c is hexadecimal of 12.

When your CPU has to operate on that variable, it needs to bring it into one of its registers. In "1 clock" a CPU can fetch a small, fixed number of bits from memory; that amount is usually called a WORD. Its size is architecture dependent as well.

Now suppose you have a variable which, because of some offset, is stored across two cells.

For example, I have two different pieces of data to store (I'm going to use a "string representation" to make it clearer):

data1: "ab"
data2: "cdef"

So the memory will be laid out like this (2 different cells):

|a b c d|     |e f 0 0|

That is, data1 occupies just half of the first cell, so data2 occupies the remaining half and part of the second cell.

Now suppose your CPU wants to read data2. It needs 2 clocks to access the data: in one clock it reads the first cell, and in the next clock it reads the remaining part from the second cell.

If we align data2 in accordance with this memory example, we can introduce a sort of padding and shift data2 entirely into the second cell.

|a b 0 0|     |c d e f|
     ---
   padding

That way the CPU spends only "1 clock" to access data2.

An alignment system simply introduces that padding so that the data lines up with the system's memory cells, always according to the architecture. When the data is aligned in memory, you don't waste CPU cycles accessing it.

This is done for performance reasons (99% of the time).
