How can the following program be calling format_disk if it's never called in code?

#include <cstdio>

static void format_disk()
{
  std::puts("formatting hard disk drive!");
}

static void (*foo)() = nullptr;

void never_called()
{
  foo = format_disk;
}

int main()
{
  foo();
}

This differs from compiler to compiler. Compiling with Clang with optimizations on, the function format_disk executes at runtime.

$ clang++ -std=c++17 -O3 a.cpp && ./a.out
formatting hard disk drive!

Compiling with GCC, however, this code just crashes:

$ g++ -std=c++17 -O3 a.cpp && ./a.out
Segmentation fault (core dumped)


$ clang --version
clang version 5.0.0 (tags/RELEASE_500/final)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
$ gcc --version
gcc (GCC) 7.2.1 20171128

The program contains undefined behavior: calling through a null function pointer (i.e. calling foo() in main without first assigning a valid address to it) is UB, so the standard imposes no requirements on what happens.

Executing format_disk at runtime is a perfectly valid outcome once undefined behavior has been hit; it's as valid as just crashing (which is what happens when compiled with GCC). Okay, but why is Clang doing that? If you compile it with optimizations off, the program no longer outputs "formatting hard disk drive!" and just crashes:

$ clang++ -std=c++17 -O0 a.cpp && ./a.out
Segmentation fault (core dumped)


The generated code for this version is as follows:

main:                                   # @main
        push    rbp
        mov     rbp, rsp
        call    qword ptr [foo]
        xor     eax, eax
        pop     rbp
        ret

It tries to call the function that foo points to, and since foo is initialized with nullptr (the same would hold without an explicit initializer, because objects with static storage duration are zero-initialized), its value is zero. Here, undefined behavior has been hit, so anything at all can happen and the program is rendered useless. In practice, calling such an invalid address usually results in a segmentation fault, hence the message we get when executing the program.
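
If skipping the call is acceptable when foo was never assigned, the crash can be avoided at the source level by checking the pointer first. This is only a minimal sketch under that assumption; the helper call_if_set and the function demo_guarded_call are hypothetical names that do not appear in the original program.

```cpp
#include <cassert>
#include <cstdio>

static void format_disk()
{
  std::puts("formatting hard disk drive!");
}

static void (*foo)() = nullptr;

void never_called()
{
  foo = format_disk;
}

// Hypothetical helper: call through foo only when it holds a valid address.
// Returns true if the call was made.
bool call_if_set()
{
  if (foo == nullptr)
    return false;  // no call through a null pointer, so no undefined behavior
  foo();
  return true;
}

// Hypothetical usage: the guarded call is skipped first, then made once
// never_called() has assigned foo.
void demo_guarded_call()
{
  assert(!call_if_set());
  never_called();
  assert(call_if_set());
}
```

With the check in place the program is well defined whether or not never_called runs, so the optimizer can no longer assume that it did.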


Now let's examine the same program but compiling it with optimizations on:

$ clang++ -std=c++17 -O3 a.cpp && ./a.out
formatting hard disk drive!


The generated code for this version is as follows:

never_called():                         # @never_called()
        ret

main:                                   # @main
        push    rax
        mov     edi, .L.str
        call    puts
        xor     eax, eax
        pop     rcx
        ret

.L.str:
        .asciz  "formatting hard disk drive!"

Interestingly, somehow optimizations modified the program so that main calls std::puts directly. But why did Clang do that? And why is never_called compiled to a single ret instruction?

Let's get back to the standard (N4660, specifically) for a moment. What does it say about undefined behavior?

3.27 undefined behavior [defns.undefined]


[Note: Undefined behavior may be expected when this document omits any explicit definition of behavior or when a program uses an erroneous construct or erroneous data. Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message). Many erroneous program constructs do not engender undefined behavior; they are required to be diagnosed. Evaluation of a constant expression never exhibits behavior explicitly specified as undefined ([expr.const]). - end note]


A program that exhibits undefined behavior becomes useless, as everything it has done so far and will do afterwards has no meaning if it contains erroneous data or constructs. With that in mind, remember that compilers may completely ignore the case in which undefined behavior is hit, and that they actually use this as a discovered fact when optimizing a program. For instance, a construct like x + 1 > x (where x is a signed integer) will be optimized to the constant true, even if the value of x is unknown at compile time. The reasoning is that the compiler wants to optimize for valid cases, and the only way for that construct to be valid is when it doesn't trigger arithmetic overflow (i.e. if x != std::numeric_limits<decltype(x)>::max()). This is a newly learned fact in the optimizer. Based on that, the construct is proven to always evaluate to true.
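
To make the signed case concrete, here is a small sketch. The function names succ_is_greater, can_increment and demo_signed are made up for illustration, and whether a particular compiler really folds the first function to a constant depends on its version and flags.

```cpp
#include <cassert>
#include <limits>

// For signed x, x + 1 overflows only when x is the maximum value, which is
// undefined behavior. An optimizer is therefore allowed to treat this
// function as if it always returned true.
bool succ_is_greater(int x)
{
  return x + 1 > x;
}

// Well-defined alternative: test for the overflowing case instead of
// triggering it.
bool can_increment(int x)
{
  return x != std::numeric_limits<int>::max();
}

void demo_signed()
{
  // succ_is_greater is only called with values that cannot overflow.
  assert(succ_is_greater(0));
  assert(succ_is_greater(-1));
  assert(can_increment(41));
  assert(!can_increment(std::numeric_limits<int>::max()));
}
```

Calling succ_is_greater with the maximum int is exactly the case the optimizer is allowed to ignore, which is why the demo never does so.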

Note: this same optimization can't be applied to unsigned integers, because unsigned overflow is not UB. That is, the compiler needs to keep the expression as it is, since it evaluates differently when overflow occurs (unsigned arithmetic is modulo 2^N, where N is the number of bits). Optimizing it away for unsigned integers would be non-compliant with the standard (thanks aschepler).
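
The unsigned case can be checked directly, since wraparound is well defined. A short sketch, with succ_is_greater_unsigned and demo_unsigned as illustrative names only:

```cpp
#include <cassert>
#include <limits>

// Unsigned arithmetic wraps modulo 2^N, so this comparison has a defined
// result for every input and cannot be replaced by a constant.
bool succ_is_greater_unsigned(unsigned x)
{
  return x + 1 > x;
}

void demo_unsigned()
{
  assert(succ_is_greater_unsigned(0u));
  // max + 1 wraps around to 0, and 0 > max is false.
  assert(!succ_is_greater_unsigned(std::numeric_limits<unsigned>::max()));
}
```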

This is useful as it allows for tons of optimizations to kick in. So far, so good, but what happens if x holds its maximum value at runtime? Well, that is undefined behavior, so it's nonsense to try to reason about it, as anything may happen and the standard imposes no requirements.

Now we have enough information to better examine your faulty program. We already know that calling through a null pointer is undefined behavior, and that's what's causing the funny behavior at runtime. So let's try to understand why Clang (or technically LLVM) optimized the program the way it did.

static void (*foo)() = nullptr;

static void format_disk()
{
  std::puts("formatting hard disk drive!");
}

void never_called()
{
  foo = format_disk;
}

int main()
{
  foo();  // undefined behavior unless foo was assigned first
}

Remember that it's possible to call never_called before the main entry starts executing. For example, when declaring a top-level variable, you can call it while initializing the value of that variable:

void never_called();
int x = (never_called(), 42);

If you write this snippet in your program, the program no longer exhibits undefined behavior, and the message "formatting hard disk drive!" is displayed, with optimizations either on or off.
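
Put together, the valid variant might look like the sketch below. To keep it easy to exercise, the call that main makes in the original program is wrapped in a function named call_foo, which is a made-up name; everything else comes from the program above.

```cpp
#include <cassert>
#include <cstdio>

static void format_disk()
{
  std::puts("formatting hard disk drive!");
}

static void (*foo)() = nullptr;

void never_called();

// Dynamic initialization of a namespace-scope variable: never_called() runs
// before main starts executing, so foo is already set by then.
int x = (never_called(), 42);

void never_called()
{
  foo = format_disk;
}

// Stand-in for the body of main in the original program.
void call_foo()
{
  assert(foo == format_disk);  // assigned while x was being initialized
  foo();
}
```

In the original program the body of call_foo would simply be the body of main.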

So what's the only way this program is valid? There's this never_called function that assigns the address of format_disk to foo, so we might find something here. Note that foo is marked as static, which means it has internal linkage and can't be accessed from outside this translation unit. In contrast, the function never_called has external linkage and may be called from outside. If another translation unit contains a snippet like the one above, then this program becomes valid.

Cool, but no one is calling never_called from outside. Even so, the optimizer sees that the only way for this program to be valid is if never_called is called before main executes; otherwise it's just undefined behavior. That's a newly learned fact, so the compiler assumes never_called is in fact called. Based on that new knowledge, other optimizations that kick in may take advantage of it.

For instance, when constant folding is applied, it sees that the construct foo() is only valid if foo has been properly initialized. The only way for that to happen is if never_called is called from outside this translation unit, so foo == format_disk.

Dead code elimination and interprocedural optimization might find out that if foo == format_disk, then the code inside never_called is unneeded, so the function's body is transformed into a single ret instruction.

Inline expansion optimization sees that foo == format_disk, so the call to foo can be replaced with its body. In the end, we end up with something like this:

main:
        mov     edi, .L.str
        call    puts
        xor     eax, eax
        ret

.L.str:
        .asciz  "formatting hard disk drive!"

This is roughly equivalent to the output of Clang with optimizations on. Of course, what Clang really did may well be different, but these optimizations are nonetheless capable of reaching the same conclusion.
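
At the source level, the optimized main behaves roughly as if it had been written like this. It is only a sketch of the observable effect, not a claim about Clang's internal passes, and the name main_as_optimized is made up for illustration.

```cpp
#include <cstdio>

// Hypothetical source-level equivalent of what Clang emitted for main at -O3:
// the indirect call through foo has been replaced by the body of format_disk.
int main_as_optimized()
{
  std::puts("formatting hard disk drive!");
  return 0;  // corresponds to `xor eax, eax`
}
```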

Examining GCC's output with optimizations on, it seems it didn't bother investigating:

.LC0:
        .string "formatting hard disk drive!"
format_disk():
        mov     edi, OFFSET FLAT:.LC0
        jmp     puts
never_called():
        mov     QWORD PTR foo[rip], OFFSET FLAT:format_disk()
        ret
main:
        sub     rsp, 8
        call    [QWORD PTR foo[rip]]
        xor     eax, eax
        add     rsp, 8
        ret

Executing that program results in a crash (segmentation fault), but if you call never_called in another translation unit before main gets executed, then this program doesn't exhibit undefined behavior anymore.


All of this can change drastically as more and more optimizations are engineered, so do not rely on the assumption that your compiler will take care of code containing undefined behavior. It might just screw you up as well (and format your hard drive for real!).

I recommend you read What every C programmer should know about Undefined Behavior and A Guide to Undefined Behavior in C and C++. Both article series are very informative and might help you understand the state of the art.
