Struct reordering by the compiler
Suppose I have a struct like this:
struct MyStruct
{
    uint8_t var0;
    uint32_t var1;
    uint8_t var2;
    uint8_t var3;
    uint8_t var4;
};
This will probably waste some space (not a ton, but some), because the uint32_t variable must be suitably aligned.
In practice, once the compiler has inserted the padding needed to align the uint32_t variable, the struct might look something like this:
struct MyStruct
{
    uint8_t var0;
    uint8_t unused[3]; // 3 bytes of wasted space
    uint32_t var1;
    uint8_t var2;
    uint8_t var3;
    uint8_t var4;
};
A more efficient struct would be:
struct MyStruct
{
    uint8_t var0;
    uint8_t var2;
    uint8_t var3;
    uint8_t var4;
    uint32_t var1;
};
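As a rough check of the two layouts above, the sketch below declares both versions under the hypothetical names Original and Reordered (the question calls both MyStruct) and compares their sizes. The helper bytes_saved is also made up for illustration. Exact sizes are implementation-defined, so the code only asserts relationships that should hold regardless of the platform's alignment rules.

```cpp
#include <cstddef>   // offsetof, std::size_t
#include <cstdint>   // std::uint8_t, std::uint32_t

// The layout from the question, members in declaration order.
struct Original {
    std::uint8_t  var0;
    std::uint32_t var1;
    std::uint8_t  var2;
    std::uint8_t  var3;
    std::uint8_t  var4;
};

// The hand-reordered layout: the one-byte members first, then var1.
struct Reordered {
    std::uint8_t  var0;
    std::uint8_t  var2;
    std::uint8_t  var3;
    std::uint8_t  var4;
    std::uint32_t var1;
};

// Number of bytes saved by reordering on the current implementation.
constexpr std::size_t bytes_saved() {
    return sizeof(Original) - sizeof(Reordered);
}

// var1 always lands on an offset that satisfies its alignment requirement.
static_assert(offsetof(Original, var1) % alignof(std::uint32_t) == 0,
              "var1 must be suitably aligned");
// Grouping the small members together should never make the struct larger.
static_assert(sizeof(Reordered) <= sizeof(Original),
              "reordering should not grow the struct");
```

On a typical platform where uint32_t is 4-byte aligned, sizeof(Original) comes out as 12 (3 padding bytes after var0 plus 1 trailing byte) and sizeof(Reordered) as 8, but neither number is guaranteed by the standard.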
Now, the question is:
Why is the compiler forbidden (by the standard) from reordering the struct?
I don't see any way you could shoot yourself in the foot if the struct were reordered.
Solution
Why is the compiler forbidden (by the standard) from reordering the struct?
The basic reason is: for compatibility with C.
Remember that C was originally a high-level assembly language. It is quite common in C to view memory (network packets, ...) by reinterpreting the bytes as a specific struct.
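To illustrate why this idiom depends on a fixed member order, here is a minimal sketch that fills a struct from raw bytes. PacketHeader, its fields, and parse_header are hypothetical names, not part of any real protocol. The sketch uses std::memcpy rather than a pointer cast to sidestep aliasing problems, and it assumes the compiler inserts no padding between one-byte members (checked by the static_assert).

```cpp
#include <cstdint>   // std::uint8_t
#include <cstring>   // std::memcpy

// Hypothetical four-byte wire header; the field order mirrors the byte order
// on the wire, which only works because members keep their declaration order.
struct PacketHeader {
    std::uint8_t version;
    std::uint8_t flags;
    std::uint8_t type;
    std::uint8_t length;
};

// Assumption: no padding between one-byte members. Fail loudly if that is wrong.
static_assert(sizeof(PacketHeader) == 4, "unexpected padding in PacketHeader");

// Copy the first sizeof(PacketHeader) bytes of a buffer into the struct.
// The caller must supply at least that many bytes.
inline PacketHeader parse_header(const unsigned char* bytes) {
    PacketHeader header;
    std::memcpy(&header, bytes, sizeof header);
    return header;
}
```

If the compiler were free to shuffle the members, the first byte of the buffer would no longer reliably end up in version, and code like this would break silently.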
This has led to multiple features relying on this property:
- C guaranteed that the address of a struct and the address of its first data member are one and the same, so C++ does too (in the absence of virtual inheritance/methods).
- C guaranteed that if you have two structs A and B and both start with a data member char followed by a data member int (and whatever after), then when you put them in a union you can write the B member and read the char and int through its A member, so C++ does too: Standard Layout.
The latter is extremely broad, and completely prevents any re-ordering of data members for most struct (or class).
Note that the Standard does allow some re-ordering: since C did not have the concept of access control, C++ specifies that the relative order of two data members with a different access control specifier is unspecified.
As far as I know, no compiler attempts to take advantage of it; but they could in theory.
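The connection between access control and layout guarantees can be observed with std::is_standard_layout. In this small sketch, SameAccess and MixedAccess are hypothetical types. A class whose non-static data members differ in access control is not standard-layout, so the layout guarantees discussed above do not apply to it.

```cpp
#include <type_traits>   // std::is_standard_layout

// All data members share the same access control: standard-layout.
struct SameAccess {
    int a;
    int b;
};

// Data members with different access control: not standard-layout.
class MixedAccess {
public:
    int a;
    int get_b() const { return b; }   // accessor so that b is actually used
private:
    int b;
};

static_assert(std::is_standard_layout<SameAccess>::value,
              "same access control keeps the type standard-layout");
static_assert(!std::is_standard_layout<MixedAccess>::value,
              "mixed access control drops the standard-layout guarantee");
```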
Outside of C++, languages such as Rust allow compilers to re-order fields and the main Rust compiler (rustc) does so by default. Only historical decisions and a strong desire for backward compatibility prevent C++ from doing so.