Why does integer overflow on x86 with GCC cause an infinite loop?

2021-12-18 00:00:00 gcc c c++ undefined-behavior x86

The following code goes into an infinite loop on GCC:

#include <iostream>
using namespace std;

int main(){
    int i = 0x10000000;

    int c = 0;
    do{
        c++;
        i += i;
        cout << i << endl;
    }while (i > 0);

    cout << c << endl;
    return 0;
}

So here's the deal: Signed integer overflow is technically undefined behavior. But GCC on x86 implements integer arithmetic using x86 integer instructions - which wrap on overflow.

Therefore, I would have expected it to wrap on overflow - despite the fact that it is undefined behavior. But that's clearly not the case. So what did I miss?

I compiled this using:

~/Desktop$ g++ main.cpp -O2

GCC output:

~/Desktop$ ./a.out
536870912
1073741824
-2147483648
0
0
0

... (infinite loop)

With optimizations disabled, there is no infinite loop and the output is correct. Visual Studio also correctly compiles this and gives the following result:

Correct output:

~/Desktop$ g++ main.cpp
~/Desktop$ ./a.out
536870912
1073741824
-2147483648
3

Here are some other variations:

i *= 2;   //  Also fails and goes into infinite loop.
i <<= 1;  //  This seems okay. It does not enter infinite loop.

Here's all the relevant version information:

~/Desktop$ g++ -v
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.5.2/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ..

...

Thread model: posix
gcc version 4.5.2 (Ubuntu/Linaro 4.5.2-8ubuntu4) 
~/Desktop$ 

So the question is: Is this a bug in GCC? Or did I misunderstand something about how GCC handles integer arithmetic?

*I'm tagging this C as well, because I assume this bug will reproduce in C. (I haven't verified it yet.)

Here's the assembly of the loop: (if I recognized it properly)

.L5:
addl    %ebp, %ebp
movl    $_ZSt4cout, %edi
movl    %ebp, %esi
.cfi_offset 3, -40
call    _ZNSolsEi
movq    %rax, %rbx
movq    (%rax), %rax
movq    -24(%rax), %rax
movq    240(%rbx,%rax), %r13
testq   %r13, %r13
je  .L10
cmpb    $0, 56(%r13)
je  .L3
movzbl  67(%r13), %eax
.L4:
movsbl  %al, %esi
movq    %rbx, %rdi
addl    $1, %r12d
call    _ZNSo3putEc
movq    %rax, %rdi
call    _ZNSo5flushEv
cmpl    $3, %r12d
jne .L5

Recommended answer

When the standard says it's undefined behavior, it means it. Anything can happen. "Anything" includes "usually integers wrap around, but on occasion weird stuff happens".

Yes, on x86 CPUs, integers usually wrap the way you expect. This is one of those exceptions. The compiler assumes you won't cause undefined behavior, and optimizes away the loop test: i starts out positive, and doubling a positive value can never produce a non-positive one unless it overflows, so the compiler is free to treat i > 0 as always true. If you really want wraparound, pass -fwrapv to g++ or gcc when compiling; this gives you well-defined (two's-complement) overflow semantics, but can hurt performance.
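If you'd rather not depend on a compiler flag, the loop can also be rewritten so that it never overflows a signed integer in the first place. Below is a minimal sketch of two ways to do that. It is not from the original post: the function names count_doublings_unsigned, checked_double and count_doublings_checked are made up for illustration, and the first variant assumes a 32-bit int, as on the x86-64 target in the question.

```cpp
#include <cstdint>
#include <limits>

// Variant 1 (hypothetical helper): do the doubling in unsigned arithmetic.
// Unsigned overflow is defined to wrap modulo 2^32, so the compiler cannot
// assume it never happens. The loop test checks explicitly for what
// "i > 0" meant on a 32-bit two's-complement int: nonzero, top bit clear.
int count_doublings_unsigned(std::uint32_t start) {
    std::uint32_t i = start;
    int c = 0;
    do {
        ++c;
        i += i;  // well-defined wraparound
    } while (i != 0 && i < 0x80000000u);
    return c;
}

// Variant 2 (hypothetical helper): stay with int, but refuse to double
// when the result would not fit. Returns false and leaves i untouched
// instead of overflowing.
bool checked_double(int& i) {
    if (i > std::numeric_limits<int>::max() / 2 ||
        i < std::numeric_limits<int>::min() / 2) {
        return false;
    }
    i += i;
    return true;
}

int count_doublings_checked(int start) {
    int i = start;
    int c = 0;
    do {
        ++c;
        if (!checked_double(i)) {
            break;  // the next doubling would overflow; stop here
        }
    } while (i > 0);
    return c;
}
```

Called with the question's starting value 0x10000000, both functions return 3, the same count the unoptimized build printed, and neither relies on undefined behavior, so the result should not change with the optimization level.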
