Function that is never called in code gets called at runtime
How can the following program be calling format_disk if it's never called in code?
#include <cstdio>

static void format_disk()
{
    std::puts("formatting hard disk drive!");
}

static void (*foo)() = nullptr;

void never_called()
{
    foo = format_disk;
}

int main()
{
    foo();
}
This differs from compiler to compiler. When compiled with Clang with optimizations on, format_disk executes at runtime, as if never_called had been called.
$ clang++ -std=c++17 -O3 a.cpp && ./a.out
formatting hard disk drive!
Compiling with GCC, however, this code just crashes:
$ g++ -std=c++17 -O3 a.cpp && ./a.out
Segmentation fault (core dumped)
Compiler versions:
$ clang --version
clang version 5.0.0 (tags/RELEASE_500/final)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
$ gcc --version
gcc (GCC) 7.2.1 20171128
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Recommended answer
The program contains undefined behavior: dereferencing a null pointer (i.e. calling foo() in main without assigning a valid address to it beforehand) is UB, so the standard imposes no requirements.
Executing format_disk at runtime is a perfectly valid outcome once undefined behavior has been hit; it is just as valid as crashing (which is what happens when compiled with GCC). Okay, but why is Clang doing that? If you compile with optimizations off, the program no longer prints "formatting hard disk drive!" and just crashes:
$ clang++ -std=c++17 -O0 a.cpp && ./a.out
Segmentation fault (core dumped)
The generated code for this version is as follows:
main: # @main
push rbp
mov rbp, rsp
call qword ptr [foo]
xor eax, eax
pop rbp
ret
It tries to call the function that foo points to, and since foo is initialized with nullptr (the same would hold without an explicit initializer, because objects with static storage duration are zero-initialized), its value is zero. Here undefined behavior has been hit, so anything at all can happen and the program is rendered useless. In practice, calling such an invalid address usually results in a segmentation fault, hence the message we get when executing the program.
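For contrast, the call stays well-defined if the pointer is tested before it is used. The following is only a minimal sketch; call_if_set, bump and g_calls are hypothetical names introduced for illustration and are not part of the original program.

```cpp
#include <cassert>

// Hypothetical counter so that the effect of a call can be observed.
static int g_calls = 0;

static void bump() { ++g_calls; }

// Calls fn only when it is non-null and reports whether a call was made.
// Comparing a function pointer against nullptr is well-defined,
// unlike calling through a null function pointer.
bool call_if_set(void (*fn)())
{
    if (fn == nullptr)
        return false;
    fn();
    return true;
}

// Small self-check: a null pointer is skipped, a valid one is called.
void check_examples()
{
    assert(!call_if_set(nullptr));
    assert(call_if_set(bump));
    assert(g_calls >= 1);
}
```

With such a guard the optimizer can no longer treat the null case as unreachable, because the program handles it explicitly.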
Now let's examine the same program compiled with optimizations on:
$ clang++ -std=c++17 -O3 a.cpp && ./a.out
formatting hard disk drive!
The generated code for this version is as follows:
never_called(): # @never_called()
ret
main: # @main
push rax
mov edi, .L.str
call puts
xor eax, eax
pop rcx
ret
.L.str:
.asciz "formatting hard disk drive!"
Interestingly, the optimizations somehow modified the program so that main calls std::puts directly. But why did Clang do that? And why is never_called compiled to a single ret instruction?
Let's get back to the standard (N4660, specifically) for a moment. What does it say about undefined behavior?
3.27 undefined behavior [defns.undefined]
behavior for which this document imposes no requirements
[Note: Undefined behavior may be expected when this document omits any explicit definition of behavior or when a program uses an erroneous construct or erroneous data. Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message). Many erroneous program constructs do not engender undefined behavior; they are required to be diagnosed. Evaluation of a constant expression never exhibits behavior explicitly specified as undefined ([expr.const]). -- end note]
Emphasis mine.
A program that exhibits undefined behavior becomes useless: everything it has done so far and everything it will do afterwards has no meaning if it contains erroneous data or constructs. With that in mind, remember that compilers may completely ignore the case in which undefined behavior is hit, and that optimizers actually use this as a source of facts about the program. For instance, a construct like x + 1 > x (where x is a signed integer) can be optimized to the constant true, even if the value of x is unknown at compile time. The reasoning is that the compiler wants to optimize for valid cases, and the only way for that construct to be valid is when it doesn't trigger arithmetic overflow (i.e. if x != std::numeric_limits<decltype(x)>::max()). This is a newly learned fact in the optimizer. Based on it, the construct is proven to always evaluate to true.
Note: the same optimization can't be applied to unsigned integers, because overflowing one is not UB. The compiler has to keep the expression as it is, since it evaluates differently when overflow occurs (unsigned arithmetic is modulo 2^N, where N is the number of bits). Optimizing it away for unsigned integers would not conform to the standard (thanks aschepler).
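The difference between the signed and unsigned cases can be sketched as follows. The function names are made up for this illustration, and whether a particular compiler actually folds the signed version to a constant depends on the compiler and the optimization level.

```cpp
#include <cassert>
#include <limits>

// Signed: overflow in x + 1 is UB, so an optimizer may assume it never
// happens and reduce this to `return true`. Only call it with x < INT_MAX.
bool signed_next_is_greater(int x)
{
    return x + 1 > x;
}

// Unsigned: arithmetic wraps modulo 2^N, so the comparison has to be
// evaluated; it is false when x holds the maximum value.
bool unsigned_next_is_greater(unsigned x)
{
    return x + 1 > x;
}

void check_examples()
{
    assert(signed_next_is_greater(41));
    assert(unsigned_next_is_greater(41u));
    assert(!unsigned_next_is_greater(std::numeric_limits<unsigned>::max()));
}
```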
This is useful because it allows tons of optimizations to kick in. So far, so good, but what happens if x holds its maximum value at runtime? Well, that is undefined behavior, so it's nonsense to try to reason about it: anything may happen and the standard imposes no requirements.
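If the maximum value is a real possibility at runtime, the overflow has to be ruled out before the addition is performed. A minimal sketch, assuming a hypothetical helper named checked_increment:

```cpp
#include <cassert>
#include <limits>

// Stores x + 1 in out and returns true, or returns false (leaving out
// untouched) when the increment would overflow.
bool checked_increment(int x, int& out)
{
    if (x == std::numeric_limits<int>::max())
        return false;
    out = x + 1;
    return true;
}

void check_examples()
{
    int result = 0;
    assert(checked_increment(41, result));
    assert(result == 42);
    assert(!checked_increment(std::numeric_limits<int>::max(), result));
    assert(result == 42);  // unchanged after the rejected increment
}
```

The test happens before the arithmetic, so no signed overflow is ever evaluated and the compiler has nothing to assume away.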
Now we have enough information to examine your faulty program more closely. We already know that calling through a null pointer is undefined behavior, and that's what causes the funny behavior at runtime. So let's try to understand why Clang (or, technically, LLVM) optimized the program the way it did.
#include <cstdio>

static void (*foo)() = nullptr;

static void format_disk()
{
    std::puts("formatting hard disk drive!");
}

void never_called()
{
    foo = format_disk;
}

int main()
{
    foo();
}
Remember that never_called can be called before main starts executing. For example, when defining a namespace-scope variable, you can call it while initializing that variable:
void never_called();
int x = (never_called(), 42);
If you write this snippet in your program, the program no longer exhibits undefined behavior, and the message "formatting hard disk drive!" is displayed with optimizations either on or off.
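Put together in a single translation unit, the idea looks roughly like this. It is a sketch only: call_foo is a hypothetical stand-in for the body of main, and the null check is there just so the helper is safe to call in isolation.

```cpp
#include <cassert>
#include <cstdio>

// Constant-initialized, so it is already null before any dynamic
// initialization in this translation unit runs.
static void (*foo)() = nullptr;

static void format_disk()
{
    std::puts("formatting hard disk drive!");
}

void never_called()
{
    foo = format_disk;
}

// Dynamic initialization of a namespace-scope variable: never_called()
// runs as part of initializing x, in practice before main() starts.
int x = (never_called(), 42);

// Hypothetical stand-in for main(): calls through foo if it has been set.
bool call_foo()
{
    if (foo == nullptr)
        return false;
    foo();
    return true;
}

void check_examples()
{
    assert(x == 42);
    assert(call_foo());
}
```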
So what's the only way for this program to be valid? There's this never_called function that assigns the address of format_disk to foo, so we might find something here. Note that foo is marked static, which means it has internal linkage and can't be accessed from outside this translation unit. In contrast, the function never_called has external linkage and may be called from outside. If another translation unit contains a snippet like the one above, then this program becomes valid.
Cool, but no one is calling never_called from outside. Even so, the optimizer sees that the only way for this program to be valid is for never_called to be called before main executes; otherwise it's just undefined behavior. That's a newly learned fact, so the compiler assumes never_called is in fact called. Other optimizations that kick in later may take advantage of that knowledge.
For instance, when constant folding is applied, it sees that the construct foo() is only valid if foo has been properly initialized. The only way for that to happen is if never_called is called from outside this translation unit, so foo = format_disk.
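The reasoning above can be mimicked with a toy model. This is not how LLVM is implemented; it is only a sketch of the inference, and sole_valid_target, other_function and g_marker are names invented for the illustration. Given every value ever stored to an internal-linkage function pointer, the null value is discarded (calling it would be UB), and if exactly one candidate remains, the call can be resolved to it.

```cpp
#include <cassert>
#include <vector>

using Fn = void (*)();

// Toy model of the inference: returns the single non-null value that is
// ever stored to the pointer, or nullptr if nothing can be concluded.
Fn sole_valid_target(const std::vector<Fn>& stored_values)
{
    Fn target = nullptr;
    for (Fn f : stored_values) {
        if (f == nullptr)
            continue;           // calling null is UB, so assume it is never reached
        if (target != nullptr && target != f)
            return nullptr;     // more than one candidate: no conclusion
        target = f;
    }
    return target;
}

static int g_marker = 0;
static void format_disk()    { g_marker = 1; }
static void other_function() { g_marker = 2; }

void check_examples()
{
    const std::vector<Fn> as_in_program = {nullptr, format_disk};
    const std::vector<Fn> ambiguous     = {format_disk, other_function};
    assert(sole_valid_target(as_in_program) == format_disk);
    assert(sole_valid_target(ambiguous) == nullptr);
}
```

In the program under discussion the stored values are exactly nullptr (the initializer) and format_disk (the assignment in never_called), which is why only one target survives.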
Dead code elimination and interprocedural optimization might find out that if foo == format_disk, then the code inside never_called is unneeded, so the function's body is reduced to a single ret instruction.
Inline expansion sees that foo == format_disk, so the call through foo can be replaced with the body of format_disk. We end up with something like this:
never_called():
ret
main:
mov edi, .L.str
call puts
xor eax, eax
ret
.L.str:
.asciz "formatting hard disk drive!"
This is roughly equivalent to Clang's output with optimizations on. Of course, what Clang really did may (and probably does) differ, but these optimizations are nonetheless capable of reaching the same conclusion.
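Expressed back in source form, the optimized assembly corresponds roughly to the following. This is a hand-written approximation rather than compiler output, and optimized_main is a hypothetical stand-in for main.

```cpp
#include <cassert>
#include <cstdio>

// Once foo is assumed to equal format_disk, the store in never_called()
// is redundant, so its body is empty (a single ret).
void never_called() {}

// Stand-in for main(): the indirect call through foo has been replaced
// by the inlined body of format_disk.
int optimized_main()
{
    std::puts("formatting hard disk drive!");
    return 0;
}

void check_examples()
{
    assert(optimized_main() == 0);
}
```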
Examining GCC's output with optimizations on, it seems GCC didn't bother to investigate:
.LC0:
.string "formatting hard disk drive!"
format_disk():
mov edi, OFFSET FLAT:.LC0
jmp puts
never_called():
mov QWORD PTR foo[rip], OFFSET FLAT:format_disk()
ret
main:
sub rsp, 8
call [QWORD PTR foo[rip]]
xor eax, eax
add rsp, 8
ret
Executing that program results in a crash (segmentation fault), but if you call never_called from another translation unit before main gets executed, then the program no longer exhibits undefined behavior.
All of this can change drastically as more and more optimizations are engineered, so do not rely on the assumption that your compiler will take care of code containing undefined behavior. It might just as well screw you over (and format your hard drive for real!).
I recommend you read "What Every C Programmer Should Know About Undefined Behavior" and "A Guide to Undefined Behavior in C and C++". Both article series are very informative and might help you understand the state of the art.