Why is unsigned integer overflow defined behavior but signed integer overflow isn't?
Unsigned integer overflow is well defined by both the C and C++ standards. For example, the C99 standard (§6.2.5/9) states:
A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.
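To make the quoted rule concrete, here is a minimal sketch in C. The function names (wrap_add_u32, modular_add_u32, check_unsigned_wraparound) are made up for illustration and do not come from the standard; the fixed-width uint32_t type is assumed to be available so that the modulus is exactly 2^32.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Adds two 32-bit unsigned values; the result is reduced modulo 2^32. */
uint32_t wrap_add_u32(uint32_t a, uint32_t b)
{
    return (uint32_t)(a + b);
}

/* Computes the same sum in a wider type and reduces it explicitly. */
uint32_t modular_add_u32(uint32_t a, uint32_t b)
{
    uint64_t exact = (uint64_t)a + (uint64_t)b;   /* cannot wrap in 64 bits */
    return (uint32_t)(exact % ((uint64_t)UINT32_MAX + 1u));
}

/* A few spot checks of the wraparound rule. */
void check_unsigned_wraparound(void)
{
    assert(wrap_add_u32(UINT32_MAX, 1u) == 0u);
    assert(wrap_add_u32(UINT32_MAX, 2u) == 1u);
    assert(wrap_add_u32(4000000000u, 500000000u) ==
           modular_add_u32(4000000000u, 500000000u));
    assert(UINT_MAX + 1u == 0u);   /* same rule for plain unsigned int */
}
```

Calling check_unsigned_wraparound() should pass on any conforming implementation, since none of these operations depends on how the platform represents signed values.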
However, both standards state that signed integer overflow is undefined behavior. Again, from the C99 standard (§3.4.3/1):
An example of undefined behavior is the behavior on integer overflow.
Is there a historical or (even better!) a technical reason for this discrepancy?
Recommended answer
The historical reason is that most C implementations (compilers) simply used whatever overflow behavior was easiest to implement with the integer representation they used. C implementations usually used the same representation as the CPU, so the overflow behavior followed from the integer representation used by the CPU.
In practice, it is only the representations for signed values that may differ according to the implementation: ones' complement, two's complement, or sign-magnitude. For an unsigned type there is no reason for the standard to allow variation, because there is only one obvious binary representation (the standard only allows binary representation).
Relevant quotes:
C99 6.2.6.1:3:
Values stored in unsigned bit-fields and objects of type unsigned char shall be represented using a pure binary notation.
C99 6.2.6.2:2:
If the sign bit is one, the value shall be modified in one of the following ways:
― the corresponding value with sign bit 0 is negated (sign and magnitude);
― the sign bit has the value −(2^N) (two's complement);
― the sign bit has the value −(2^N − 1) (ones' complement).
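As a rough illustration of the three rules quoted above, the sketch below decodes the same 8-bit pattern under each representation. It assumes a hypothetical 8-bit type with N = 7 value bits and one sign bit; the enum and the decode_8bit function are invented for this example and are not part of any standard library.

```c
#include <assert.h>
#include <stdint.h>

enum sign_repr { SIGN_MAGNITUDE, TWOS_COMPLEMENT, ONES_COMPLEMENT };

/* Interprets an 8-bit pattern with 7 value bits (N = 7) and 1 sign bit. */
int decode_8bit(uint8_t bits, enum sign_repr repr)
{
    int value_bits = bits & 0x7F;      /* the low N = 7 bits */
    int sign_bit = (bits >> 7) & 1;

    if (!sign_bit)
        return value_bits;             /* all three schemes agree here */

    switch (repr) {
    case SIGN_MAGNITUDE:
        return -value_bits;            /* value with sign bit 0, negated */
    case TWOS_COMPLEMENT:
        return value_bits - 128;       /* sign bit is worth -(2^7) */
    case ONES_COMPLEMENT:
        return value_bits - 127;       /* sign bit is worth -(2^7 - 1) */
    }
    assert(!"unknown representation");
    return 0;
}
```

For the pattern 0x81 (binary 1000 0001) this gives -1 under sign and magnitude, -127 under two's complement, and -126 under ones' complement. Because the same bits mean different numbers, arithmetic that runs past the largest value would naturally produce a different result on each kind of machine.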
Nowadays, all processors use two's complement representation, but signed arithmetic overflow remains undefined, and compiler makers want it to remain undefined because they use this undefinedness to help with optimization. See, for instance, this blog post by Ian Lance Taylor or this complaint by Agner Fog, and the answers to his bug report.
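A small sketch of the kind of optimization this refers to, plus one common way to stay clear of the undefined case. Both function names are hypothetical, and whether a particular compiler actually performs the simplification depends on the compiler and its flags; the point is only that the standard permits it.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Because signed overflow is undefined, a compiler may assume that x + 1
   never wraps and is allowed to simplify this function to "return true". */
bool next_is_greater(int x)
{
    return x + 1 > x;   /* undefined behavior if x == INT_MAX */
}

/* Hypothetical helper: stores a + b in *sum only when the result is
   representable, checking the range before performing the addition. */
bool checked_add(int a, int b, int *sum)
{
    assert(sum != NULL);
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;   /* the addition would overflow */
    *sum = a + b;
    return true;
}
```

The range test in checked_add uses only subtractions that cannot themselves overflow, so the function never executes the undefined operation it is guarding against.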