Why does the enhanced GCC 6 optimizer break practical C++ code?

GCC 6 has a new optimizer feature: it assumes that this is never null and optimizes based on that.

Value range propagation now assumes that the this pointer of C++ member functions is non-null. This eliminates common null pointer checks but also breaks some non-conforming code-bases (such as Qt-5, Chromium, KDevelop). As a temporary work-around, -fno-delete-null-pointer-checks can be used. Wrong code can be identified by using -fsanitize=undefined.

The change document clearly calls this out as dangerous because it breaks a surprising amount of frequently used code.

Why would this new assumption break practical C++ code? Are there particular patterns where careless or uninformed programmers rely on this particular undefined behavior? I cannot imagine anyone writing if (this == NULL) because that is so unnatural.

Recommended answer

I guess the question that needs to be answered is why well-intentioned people would write the checks in the first place.

The most common case is probably a class that is part of a naturally occurring recursive call.

If you have:

struct Node
{
    Node* left;
    Node* right;
};

In C, you might write:

void traverse_in_order(Node* n) {
    if(!n) return;
    traverse_in_order(n->left);
    process(n);
    traverse_in_order(n->right);
}

In C++, it's nice to make this a member function:

void Node::traverse_in_order() {
    // <--- What check should be put here?
    left->traverse_in_order();
    process();
    right->traverse_in_order();
}

In the early days of C++ (prior to standardization), it was emphasized that member functions were syntactic sugar for a function where the this parameter is implicit. Code was written in C++, converted to equivalent C, and compiled. There were even explicit examples showing that comparing this to null was meaningful, and the original Cfront compiler took advantage of this too. So coming from a C background, the obvious choice for the check is:

if(this == nullptr) return;      

Note: Bjarne Stroustrup has even mentioned that the rules for this have changed over the years.

And this worked on many compilers for many years. When standardization happened, this changed. More recently, compilers started taking advantage of the fact that calling a member function when this is nullptr is undefined behavior, which means that the condition is always false and the compiler is free to omit it.

That means that to do any traversal of this tree, you need to either:

  • Do all of the checks before calling traverse_in_order

void Node::traverse_in_order() {
    if(left) left->traverse_in_order();
    process();
    if(right) right->traverse_in_order();
}

This also means checking at EVERY call site whether you could have a null root.

  • Don't use a member function

This means that you're writing the old C-style code (perhaps as a static method) and calling it with the object explicitly as a parameter. For example, you're back to writing Node::traverse_in_order(node); rather than node->traverse_in_order(); at the call site.
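As a rough sketch of that second option, the traversal could be a static member function that takes the node as an ordinary pointer parameter, so comparing it against nullptr is well defined. The int payload named value, the constructor, and the out vector standing in for process() are assumptions added for illustration; they are not part of the original Node.

```cpp
#include <vector>

// Hypothetical Node: `value`, the constructor, and the `out` parameter are
// illustrative additions, not part of the original struct.
struct Node {
    int value;
    Node* left;
    Node* right;

    explicit Node(int v) : value(v), left(nullptr), right(nullptr) {}

    // Static member: `n` is a normal parameter rather than `this`,
    // so the null check is legal and cannot be optimized away.
    static void traverse_in_order(const Node* n, std::vector<int>& out) {
        if (!n) return;
        traverse_in_order(n->left, out);
        out.push_back(n->value);  // stands in for process()
        traverse_in_order(n->right, out);
    }
};
```

The cost is the one described above: every call site has to spell out Node::traverse_in_order(node, out) instead of node->traverse_in_order().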

I believe the easiest/neatest way to fix this particular example in a standards-compliant way is to use a sentinel node rather than a nullptr.

// static class member, or global variable
Node sentinel;

void Node::traverse_in_order() {
    if(this == &sentinel) return;
    ...
}
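To make the sentinel idea concrete, here is one possible self-contained version. It assumes the sentinel is a static class member and that every child pointer is initialized to point at it, so left and right are never null. The int payload named value, the constructor, and the out vector standing in for process() are hypothetical additions for illustration.

```cpp
#include <vector>

// Hypothetical Node: `value`, the constructor, and `out` are illustrative.
struct Node {
    int value;
    Node* left;
    Node* right;

    static Node sentinel;  // plays the role that nullptr used to play

    // Children start out pointing at the sentinel, never at nullptr.
    explicit Node(int v) : value(v), left(&sentinel), right(&sentinel) {}

    void traverse_in_order(std::vector<int>& out) {
        // Well defined: `this` always points at a real object here.
        if (this == &sentinel) return;
        left->traverse_in_order(out);
        out.push_back(value);  // stands in for process()
        right->traverse_in_order(out);
    }
};

// Definition of the static sentinel; its own children point back at itself.
Node Node::sentinel(0);
```

The design only holds up if nothing ever stores nullptr in left or right; any code that builds or edits the tree has to use &Node::sentinel for "no child".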

Neither of the first two options seems that appealing, and because code could get away with it, people wrote bad code with this == nullptr instead of using a proper fix.

I'm guessing that's how some of these code bases evolved to have this == nullptr checks in them.
