Why does optimization kill this function?
We recently had a lecture at university about programming peculiarities in several languages.
The lecturer wrote down the following function:
inline u64 Swap_64(u64 x)
{
u64 tmp;
(*(u32*)&tmp) = Swap_32(*(((u32*)&x)+1));
(*(((u32*)&tmp)+1)) = Swap_32(*(u32*) &x);
return tmp;
}
While I fully understand that this is also really bad style in terms of readability, his main point was that this piece of code worked fine in production until they enabled a high optimization level. After that, the code would simply do nothing.
He said that all the assignments to the variable tmp would be optimized out by the compiler. But why would this happen?
I understand that there are circumstances where variables need to be declared volatile so that the compiler doesn't touch them even if it thinks they are never read or written, but I don't see why that would apply here.
Recommended answer
This code violates the strict aliasing rules, which make it illegal to access an object through a pointer of a different type, although access through a char * is allowed. The compiler is allowed to assume that pointers of different types do not point to the same memory and to optimize accordingly. It also means the code invokes undefined behavior and could really do anything.
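As an illustration, here is a minimal sketch of how the function from the question could be written without casting between pointer types, by copying the bytes with memcpy. This is not taken from the lecture or from the referenced article. The u32 and u64 typedefs and the body of Swap_32 are assumptions, since the question does not show them; Swap_32 is assumed to reverse the four bytes of its argument.

```c
#include <stdint.h>
#include <string.h>

/* Assumed typedefs: the question does not show how u32 and u64 are defined. */
typedef uint32_t u32;
typedef uint64_t u64;

/* Assumed behavior of Swap_32: reverse the four bytes of x. */
static inline u32 Swap_32(u32 x)
{
    return (x >> 24)
         | ((x >> 8) & 0x0000FF00u)
         | ((x << 8) & 0x00FF0000u)
         | (x << 24);
}

/* Same logic as the original Swap_64, but the 32-bit halves are read and
   written through memcpy instead of through casted u32 pointers. */
static inline u64 Swap_64(u64 x)
{
    u32 half[2];
    u32 swapped[2];
    u64 result;

    memcpy(half, &x, sizeof x);              /* split x into two 32-bit words */
    swapped[0] = Swap_32(half[1]);
    swapped[1] = Swap_32(half[0]);
    memcpy(&result, swapped, sizeof result); /* reassemble the 64-bit value */
    return result;
}
```

Optimizing compilers commonly turn small fixed-size memcpy calls like these into plain register moves, so this form need not be slower than the pointer-cast version.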
One of the best references for this topic is Understanding Strict Aliasing, and we can see that its first example is in a similar vein to the OP's code:
uint32_t swap_words( uint32_t arg )
{
uint16_t* const sp = (uint16_t*)&arg;
uint16_t hi = sp[0];
uint16_t lo = sp[1];
sp[1] = hi;
sp[0] = lo;
return (arg);
}
The article explains that this code violates the strict aliasing rules because sp is an alias of arg but the two have different types, and says that although the code will compile, arg will likely be unchanged after swap_words returns. In simple tests I was unable to reproduce that result with either the code above or the OP's code, but that does not mean anything: this is undefined behavior and therefore not predictable.
The article goes on to discuss many different cases and presents several working solutions, including type-punning through a union, which is well-defined in C99 (see footnote 1) and may be undefined in C++ but in practice is supported by most major compilers; for example, here is gcc's reference on type-punning. The previous thread Purpose of Unions in C and C++ goes into the gory details. Although there are many threads on this topic, this one seems to do the best job.
The code for that solution is as follows:
typedef union
{
uint32_t u32;
uint16_t u16[2];
} U32;
uint32_t swap_words( uint32_t arg )
{
U32 in;
uint16_t lo;
uint16_t hi;
in.u32 = arg;
hi = in.u16[0];
lo = in.u16[1];
in.u16[0] = lo;
in.u16[1] = hi;
return (in.u32);
}
For reference, the relevant section from the C99 draft standard on strict aliasing is 6.5 Expressions, paragraph 7, which says:
An object shall have its stored value accessed only by an lvalue expression that has one of the following types: 76)
― a type compatible with the effective type of the object,
― a qualified version of a type compatible with the effective type of the object,
― a type that is the signed or unsigned type corresponding to the effective type of the object,
― a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
― an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
― a character type.
and footnote 76 says:
The intent of this list is to specify those circumstances in which an object may or may not be aliased.
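The last item in the list above (a character type) is what makes byte-wise access legal. As a rough sketch of that exception, the following function reverses the bytes of a 64-bit value by accessing both objects through unsigned char pointers. The name swap_bytes_64 is hypothetical and does not appear in the question or in the referenced article.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: reverses the eight bytes of x.
   Accessing any object through an unsigned char pointer is permitted by
   the "character type" item of C99 6.5 paragraph 7. */
static inline uint64_t swap_bytes_64(uint64_t x)
{
    uint64_t result;
    const unsigned char *src = (const unsigned char *)&x;
    unsigned char *dst = (unsigned char *)&result;

    for (size_t i = 0; i < sizeof x; ++i)
        dst[i] = src[sizeof x - 1 - i]; /* byte i of result = mirrored byte of x */

    return result;
}
```

Because it works one byte at a time, this version does not depend on the platform's endianness or on a separate 32-bit swap helper.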
The relevant section from the C++ draft standard is 3.10 Lvalues and rvalues, paragraph 10.
The article Type-punning and strict-aliasing gives a gentler but less complete introduction to the topic, and C99 revisited gives a deep analysis of C99 and aliasing and is not light reading. This answer to Accessing inactive union member - undefined? goes over the muddy details of type-punning through a union in C++ and is not light reading either.
Footnotes:
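1. Quoting a comment by Pascal Cuoq: [...] C99 was initially clumsily worded, appearing to make type-punning through unions undefined. In reality, type-punning through unions is legal in C89, legal in C11, and was legal in C99 all along, although it took until 2004 for the committee to fix the incorrect wording, with the subsequent release of TC3. open-std.org/jtc1/sc22/wg14/www/docs/dr_283.htm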