Why does this loop produce "warning: iteration 3u invokes undefined behavior" and output more than 4 lines?

2021-12-18 00:00:00 gcc c++ undefined-behavior

Compiling this:

#include <iostream>

int main()
{
    for (int i = 0; i < 4; ++i)
        std::cout << i*1000000000 << std::endl;
}

gcc produces the following warning:

warning: iteration 3u invokes undefined behavior [-Waggressive-loop-optimizations]
   std::cout << i*1000000000 << std::endl;
                  ^

I understand there is a signed integer overflow.

What I cannot understand is why the value of i is broken by that overflow operation.

I've read the answers to "Why does integer overflow on x86 with GCC cause an infinite loop?", but I'm still not clear on why this happens. I get that "undefined" means "anything can happen", but what is the underlying cause of this specific behavior?

Online: http://ideone.com/dMrRKR

Compiler: gcc (4.8)

Recommended answer

Signed integer overflow (strictly speaking, there is no such thing as "unsigned integer overflow") is undefined behaviour. This means that anything can happen, and discussing why it happens under the rules of C++ doesn't make sense.

C++11 draft N3337, §5/4 (see footnote 1):

If during the evaluation of an expression, the result is not mathematically defined or not in the range of representable values for its type, the behavior is undefined. [ Note: most existing implementations of C++ ignore integer overflows. Treatment of division by zero, forming a remainder using a zero divisor, and all floating point exceptions vary among machines, and is usually adjustable by a library function. - end note ]

Your code, compiled with g++ -O3, emits this warning (even without -Wall):

a.cpp: In function 'int main()':
a.cpp:11:18: warning: iteration 3u invokes undefined behavior [-Waggressive-loop-optimizations]
   std::cout << i*1000000000 << std::endl;
                  ^
a.cpp:9:2: note: containing loop
  for (int i = 0; i < 4; ++i)
  ^

The only way we can analyze what the program is doing is by reading the generated assembly code.

Here is the full assembly listing:

    .file   "a.cpp"
    .section    .text$_ZNKSt5ctypeIcE8do_widenEc,"x"
    .linkonce discard
    .align 2
LCOLDB0:
LHOTB0:
    .align 2
    .p2align 4,,15
    .globl  __ZNKSt5ctypeIcE8do_widenEc
    .def    __ZNKSt5ctypeIcE8do_widenEc;    .scl    2;  .type   32; .endef
__ZNKSt5ctypeIcE8do_widenEc:
LFB860:
    .cfi_startproc
    movzbl  4(%esp), %eax
    ret $4
    .cfi_endproc
LFE860:
LCOLDE0:
LHOTE0:
    .section    .text.unlikely,"x"
LCOLDB1:
    .text
LHOTB1:
    .p2align 4,,15
    .def    ___tcf_0;   .scl    3;  .type   32; .endef
___tcf_0:
LFB1091:
    .cfi_startproc
    movl    $__ZStL8__ioinit, %ecx
    jmp __ZNSt8ios_base4InitD1Ev
    .cfi_endproc
LFE1091:
    .section    .text.unlikely,"x"
LCOLDE1:
    .text
LHOTE1:
    .def    ___main;    .scl    2;  .type   32; .endef
    .section    .text.unlikely,"x"
LCOLDB2:
    .section    .text.startup,"x"
LHOTB2:
    .p2align 4,,15
    .globl  _main
    .def    _main;  .scl    2;  .type   32; .endef
_main:
LFB1084:
    .cfi_startproc
    leal    4(%esp), %ecx
    .cfi_def_cfa 1, 0
    andl    $-16, %esp
    pushl   -4(%ecx)
    pushl   %ebp
    .cfi_escape 0x10,0x5,0x2,0x75,0
    movl    %esp, %ebp
    pushl   %edi
    pushl   %esi
    pushl   %ebx
    pushl   %ecx
    .cfi_escape 0xf,0x3,0x75,0x70,0x6
    .cfi_escape 0x10,0x7,0x2,0x75,0x7c
    .cfi_escape 0x10,0x6,0x2,0x75,0x78
    .cfi_escape 0x10,0x3,0x2,0x75,0x74
    xorl    %edi, %edi
    subl    $24, %esp
    call    ___main
L4:
    movl    %edi, (%esp)
    movl    $__ZSt4cout, %ecx
    call    __ZNSolsEi
    movl    %eax, %esi
    movl    (%eax), %eax
    subl    $4, %esp
    movl    -12(%eax), %eax
    movl    124(%esi,%eax), %ebx
    testl   %ebx, %ebx
    je  L15
    cmpb    $0, 28(%ebx)
    je  L5
    movsbl  39(%ebx), %eax
L6:
    movl    %esi, %ecx
    movl    %eax, (%esp)
    addl    $1000000000, %edi
    call    __ZNSo3putEc
    subl    $4, %esp
    movl    %eax, %ecx
    call    __ZNSo5flushEv
    jmp L4
    .p2align 4,,10
L5:
    movl    %ebx, %ecx
    call    __ZNKSt5ctypeIcE13_M_widen_initEv
    movl    (%ebx), %eax
    movl    24(%eax), %edx
    movl    $10, %eax
    cmpl    $__ZNKSt5ctypeIcE8do_widenEc, %edx
    je  L6
    movl    $10, (%esp)
    movl    %ebx, %ecx
    call    *%edx
    movsbl  %al, %eax
    pushl   %edx
    jmp L6
L15:
    call    __ZSt16__throw_bad_castv
    .cfi_endproc
LFE1084:
    .section    .text.unlikely,"x"
LCOLDE2:
    .section    .text.startup,"x"
LHOTE2:
    .section    .text.unlikely,"x"
LCOLDB3:
    .section    .text.startup,"x"
LHOTB3:
    .p2align 4,,15
    .def    __GLOBAL__sub_I_main;   .scl    3;  .type   32; .endef
__GLOBAL__sub_I_main:
LFB1092:
    .cfi_startproc
    subl    $28, %esp
    .cfi_def_cfa_offset 32
    movl    $__ZStL8__ioinit, %ecx
    call    __ZNSt8ios_base4InitC1Ev
    movl    $___tcf_0, (%esp)
    call    _atexit
    addl    $28, %esp
    .cfi_def_cfa_offset 4
    ret
    .cfi_endproc
LFE1092:
    .section    .text.unlikely,"x"
LCOLDE3:
    .section    .text.startup,"x"
LHOTE3:
    .section    .ctors,"w"
    .align 4
    .long   __GLOBAL__sub_I_main
.lcomm __ZStL8__ioinit,1,1
    .ident  "GCC: (i686-posix-dwarf-rev1, Built by MinGW-W64 project) 4.9.0"
    .def    __ZNSt8ios_base4InitD1Ev;   .scl    2;  .type   32; .endef
    .def    __ZNSolsEi; .scl    2;  .type   32; .endef
    .def    __ZNSo3putEc;   .scl    2;  .type   32; .endef
    .def    __ZNSo5flushEv; .scl    2;  .type   32; .endef
    .def    __ZNKSt5ctypeIcE13_M_widen_initEv;  .scl    2;  .type   32; .endef
    .def    __ZSt16__throw_bad_castv;   .scl    2;  .type   32; .endef
    .def    __ZNSt8ios_base4InitC1Ev;   .scl    2;  .type   32; .endef
    .def    _atexit;    .scl    2;  .type   32; .endef

I can barely read assembly, but even I can see the addl $1000000000, %edi line and the unconditional jmp L4 that closes the loop. The resulting code looks more like

for(int i = 0; /* nothing, that is - infinite loop */; i += 1000000000)
    std::cout << i << std::endl;

The comment by @T.C.:

I suspect that it's something like: (1) because every iteration with i of any value larger than 2 has undefined behavior -> (2) we can assume that i <= 2 for optimization purposes -> (3) the loop condition is always true -> (4) it's optimized away into an infinite loop.

That comment gave me the idea to compare the assembly of the OP's code with the assembly of the following code, which has no undefined behaviour.

#include <iostream>

int main()
{
    // changed the termination condition
    for (int i = 0; i < 3; ++i)
        std::cout << i*1000000000 << std::endl;
}

And, in fact, the correct code has a termination condition.

    ; ...snip...
L6:
    mov ecx, edi
    mov DWORD PTR [esp], eax
    add esi, 1000000000
    call    __ZNSo3putEc
    sub esp, 4
    mov ecx, eax
    call    __ZNSo5flushEv
    cmp esi, -1294967296 ; here it is
    jne L7
    lea esp, [ebp-16]
    xor eax, eax
    pop ecx
    ; ...snip...

Unfortunately, this is the consequence of writing buggy code.

Fortunately, you can make use of better diagnostics and better debugging tools - that's what they are for:

  • enable all warnings

-Wall is the gcc option that enables all useful warnings with no false positives. This is the bare minimum that you should always use.

gcc has many other warning options; however, they are not enabled by -Wall because they may produce false positives.

Visual C++ unfortunately lags behind in its ability to give useful warnings. At least the IDE enables some by default.

  • use debug flags for debugging

  • for integer overflow, -ftrapv traps the program on overflow,
  • the Clang compiler is excellent for this: -fcatch-undefined-behavior catches a lot of instances of undefined behaviour (note: "a lot of" != "all of them"). Newer Clang and GCC releases offer -fsanitize=undefined for this purpose.

I have a spaghetti mess of a program not written by me that needs to be shipped tomorrow! HELP!!!!!!111oneone

Use gcc's -fwrapv

This option instructs the compiler to assume that signed arithmetic overflow of addition, subtraction and multiplication wraps around using two's-complement representation.

1 - this rule does not apply to "unsigned integer overflow", as §3.9.1/4 says that

Unsigned integers, declared unsigned, shall obey the laws of arithmetic modulo 2^n where n is the number of bits in the value representation of that particular size of integer.

and, for example, the result of UINT_MAX + 1 is mathematically defined by the rules of arithmetic modulo 2^n.
