Detecting signed overflow in C/C++
At first glance, this question may look like a duplicate of "How to detect integer overflow?", but it is actually significantly different.
I've found that while detecting unsigned integer overflow is fairly trivial, detecting signed overflow in C/C++ is more difficult than most people think.
The most obvious, yet naive, way to do it would be something like:
#include <stdlib.h>

int add(int lhs, int rhs)
{
    int sum = lhs + rhs;  /* undefined behavior if this overflows */
    if ((lhs >= 0 && sum < rhs) || (lhs < 0 && sum > rhs)) {
        /* an overflow has occurred */
        abort();
    }
    return sum;
}
The problem is that, according to the C standard, signed integer overflow is undefined behavior. In other words, as soon as you cause a signed overflow, your program is just as invalid as if you had dereferenced a null pointer. So you can't cause undefined behavior and then try to detect the overflow after the fact, as the post-condition check above does.
Even though the above check is likely to work on many compilers, you can't count on it. In fact, because the C standard says signed integer overflow is undefined, some compilers (such as GCC) will optimize the check away when optimization flags are set, because the compiler assumes that signed overflow cannot happen. This completely defeats the attempt to check for overflow.
So, another possible way to check for overflow would be:
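#include <limits.h>
#include <stdlib.h>

int add(int lhs, int rhs)
{
    if (lhs >= 0 && rhs >= 0) {
        /* strict comparison: a sum of exactly INT_MAX is still representable */
        if (INT_MAX - lhs < rhs) {
            /* overflow would occur */
            abort();
        }
    }
    else if (lhs < 0 && rhs < 0) {
        /* strict comparison: a sum of exactly INT_MIN is still representable */
        if (lhs < INT_MIN - rhs) {
            /* overflow would occur */
            abort();
        }
    }
    return lhs + rhs;
}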
This seems more promising, since we don't actually add the two integers until we have made sure in advance that the addition will not overflow. Thus, we don't cause any undefined behavior.
Unfortunately, this solution is less efficient than the initial one, since you have to perform a subtraction just to test whether the addition will work. And even if you don't care about this (small) performance hit, I'm still not entirely convinced this solution is adequate. The comparison of lhs against INT_MIN - rhs seems exactly like the sort of expression the compiler might optimize away, thinking that signed overflow is impossible.
So is there a better solution here? Something that is guaranteed to 1) not cause undefined behavior, and 2) not give the compiler an opportunity to optimize away the overflow checks? I was thinking there might be some way to do it by casting both operands to unsigned and performing the checks by rolling your own two's-complement arithmetic, but I'm not really sure how to do that.
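The unsigned idea could look roughly like the sketch below. It is only an illustration: the name add_would_overflow is hypothetical, and the code assumes that int and unsigned int have the same width with no padding bits (so that UINT_MAX == 2 * INT_MAX + 1), which holds on mainstream platforms but is not guaranteed by older C standards. Unsigned addition wraps modulo 2^N and is always well-defined, so the sum can be computed first and its sign bit inspected afterwards.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: returns true if lhs + rhs would overflow int.
   Assumes unsigned int has exactly one more value bit than int. */
static bool add_would_overflow(int lhs, int rhs)
{
    unsigned int ul = (unsigned int)lhs;   /* conversion is modulo 2^N, well-defined */
    unsigned int ur = (unsigned int)rhs;
    unsigned int us = ul + ur;             /* wraps on overflow, never undefined */
    unsigned int sign = ~(UINT_MAX >> 1);  /* mask with only the top bit set */

    /* Overflow happens exactly when both operands have the same sign
       and the wrapped sum has the opposite sign. */
    return ((ul ^ us) & (ur ^ us) & sign) != 0;
}
```

The helper only reports whether the addition is safe. The caller would still compute lhs + rhs in int after a negative answer, because converting an out-of-range unsigned value back to int is implementation-defined before C23.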
Recommended answer
Your approach with subtraction is correct and well-defined. Each subtraction is guarded by a sign check, so it can never overflow itself, and a compiler therefore cannot optimize the comparison away.
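As an illustration, the subtraction check can be written compactly so that it reports failure instead of aborting. This is a minimal sketch, not the asker's code: the name checked_add and the out-parameter are assumptions made for the example. It uses strict comparisons so that sums landing exactly on INT_MAX or INT_MIN are accepted.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores lhs + rhs in *out and returns true,
   or returns false and leaves *out untouched if the sum does not fit in int. */
static bool checked_add(int lhs, int rhs, int *out)
{
    /* rhs > 0 guarantees INT_MAX - rhs cannot overflow */
    if (rhs > 0 && lhs > INT_MAX - rhs)
        return false;               /* sum would exceed INT_MAX */

    /* rhs < 0 guarantees INT_MIN - rhs cannot overflow */
    if (rhs < 0 && lhs < INT_MIN - rhs)
        return false;               /* sum would fall below INT_MIN */

    *out = lhs + rhs;               /* safe: range was verified above */
    return true;
}
```

Testing only the sign of rhs is enough here, because adding a positive value can only overflow upward and adding a negative value can only overflow downward.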
Another correct approach, if you have a larger integer type available, is to perform the arithmetic in the larger type and then check that the result fits in the smaller type when converting it back:
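#include <assert.h>
#include <limits.h>
#include <stdlib.h>

int sum(int a, int b)
{
    long long c;
    assert(LLONG_MAX > INT_MAX);  /* the wider type must really be wider */
    c = (long long)a + b;         /* addition is done in long long */
    if (c < INT_MIN || c > INT_MAX) abort();
    return (int)c;
}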
A good compiler should convert the entire addition and if statement into an int-sized addition and a single conditional jump-on-overflow, and never actually perform the larger addition.
As Stephen pointed out, I'm having trouble getting gcc (a not-so-good compiler in this respect) to generate sane asm. The code it generates is not terribly slow, but it is certainly suboptimal. If anyone knows variants of this code that will get gcc to do the right thing, I'd love to see them.