Detecting signed overflow in C/C++

2022-01-31 00:00:00 c signed c++ integer-overflow undefined-behavior

At first glance, this question may seem like a duplicate of "How to detect integer overflow?"; however, it is actually significantly different.

I've found that while detecting an unsigned integer overflow is pretty trivial, detecting a signed overflow in C/C++ is actually more difficult than most people think.

The most obvious, yet naive, way to do it would be something like:

    #include <stdlib.h>  /* abort */

    int add(int lhs, int rhs)
    {
        int sum = lhs + rhs;  /* undefined behavior if this overflows */
        if ((lhs >= 0 && sum < rhs) || (lhs < 0 && sum > rhs)) {
            /* an overflow has occurred */
            abort();
        }
        return sum;
    }

The problem with this is that according to the C standard, signed integer overflow is undefined behavior. In other words, according to the standard, as soon as you even cause a signed overflow, your program is just as invalid as if you dereferenced a null pointer. So you can't cause undefined behavior, and then try to detect the overflow after the fact, as in the above post-condition check example.

Even though the above check is likely to work on many compilers, you can't count on it. In fact, because the C standard says signed integer overflow is undefined, some compilers (like GCC) will optimize away the above check when optimization flags are set, because the compiler assumes a signed overflow is impossible. This totally breaks the attempt to check for overflow.

So, another possible way to check for overflow would be:

    #include <limits.h>  /* INT_MAX, INT_MIN */
    #include <stdlib.h>  /* abort */

    int add(int lhs, int rhs)
    {
        if (lhs >= 0 && rhs >= 0) {
            if (INT_MAX - lhs <= rhs) {
                /* overflow has occurred */
                abort();
            }
        }
        else if (lhs < 0 && rhs < 0) {
            if (lhs <= INT_MIN - rhs) {
                /* overflow has occurred */
                abort();
            }
        }
        return lhs + rhs;
    }

This seems more promising, since we don't actually add the two integers together until we make sure in advance that performing such an add will not result in overflow. Thus, we don't cause any undefined behavior.

However, this solution is unfortunately a lot less efficient than the initial solution, since you have to perform a subtract operation just to test if your addition operation will work. And even if you don't care about this (small) performance hit, I'm still not entirely convinced this solution is adequate. The expression lhs <= INT_MIN - rhs seems exactly like the sort of expression the compiler might optimize away, thinking that signed overflow is impossible.

So is there a better solution here? Something that is guaranteed to 1) not cause undefined behavior, and 2) not provide the compiler with an opportunity to optimize away overflow checks? I was thinking there might be some way to do it by casting both operands to unsigned, and performing checks by rolling your own two's-complement arithmetic, but I'm not really sure how to do that.
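For reference, the unsigned idea mentioned above might be sketched roughly as follows. This is only an illustration, not a vetted implementation: the helper name add_via_unsigned is hypothetical, and the code assumes that unsigned has exactly one more value bit than int (checked at compile time). Converting an int to unsigned and adding unsigned values are both fully defined (modulo 2^N), so the check itself never touches signed overflow.

```c
#include <assert.h>   /* assert, static_assert (C11) */
#include <limits.h>   /* INT_MAX, UINT_MAX */
#include <stdbool.h>
#include <stddef.h>   /* NULL */

/* Assumption: unsigned has exactly one more value bit than int,
   which holds on mainstream platforms. */
static_assert(UINT_MAX / 2 == INT_MAX, "unexpected int/unsigned layout");

/* Hypothetical helper: returns true and stores lhs + rhs in *result when the
   sum fits in an int; returns false and leaves *result untouched otherwise. */
bool add_via_unsigned(int lhs, int rhs, int *result)
{
    assert(result != NULL);

    const unsigned sign_bit = (unsigned)INT_MAX + 1u;
    unsigned ul = (unsigned)lhs;   /* defined: value reduced modulo 2^N */
    unsigned ur = (unsigned)rhs;
    unsigned usum = ul + ur;       /* unsigned addition wraps, never UB */

    /* Overflow occurred iff both operands have the same sign and the
       wrapped sum has the opposite sign. */
    if (((ul ^ usum) & (ur ^ usum) & sign_bit) != 0) {
        return false;
    }

    *result = lhs + rhs;           /* shown to be in range by the check above */
    return true;
}
```

The final addition is done on the original int operands rather than by converting usum back, because converting an out-of-range unsigned value to int is implementation-defined in older C standards.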

Recommended answer

Your approach with subtraction is correct and well-defined. A compiler cannot optimize it away.
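To see why the subtraction is safe: INT_MAX - rhs cannot overflow when rhs is positive, and INT_MIN - rhs cannot overflow when rhs is negative, so the test is evaluated entirely within the range of int. A compact sketch of the same idea follows; the name checked_add and the bool-plus-out-parameter interface are illustrative choices, not part of the question. It uses strict comparisons, whereas the question's version uses <=, which also rejects sums that land exactly on INT_MAX or INT_MIN.

```c
#include <assert.h>
#include <limits.h>   /* INT_MAX, INT_MIN */
#include <stdbool.h>
#include <stddef.h>   /* NULL */

/* Hypothetical helper: returns true and stores lhs + rhs in *result when the
   sum is representable; returns false and leaves *result untouched otherwise. */
bool checked_add(int lhs, int rhs, int *result)
{
    assert(result != NULL);

    /* rhs > 0, so INT_MAX - rhs stays within [0, INT_MAX). */
    if (rhs > 0 && lhs > INT_MAX - rhs) {
        return false;  /* lhs + rhs would exceed INT_MAX */
    }
    /* rhs < 0, so INT_MIN - rhs stays within (INT_MIN, 0]. */
    if (rhs < 0 && lhs < INT_MIN - rhs) {
        return false;  /* lhs + rhs would fall below INT_MIN */
    }

    *result = lhs + rhs;  /* cannot overflow after the checks above */
    return true;
}
```

Returning a flag instead of calling abort() leaves the error policy to the caller; the original abort-on-overflow behavior is a one-line wrapper around this.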

Another correct approach, if you have a larger integer type available, is to perform the arithmetic in the larger type and then check that the result fits in the smaller type when converting it back:

    #include <assert.h>
    #include <limits.h>
    #include <stdlib.h>

    int sum(int a, int b)
    {
        long long c;
        assert(LLONG_MAX > INT_MAX);
        c = (long long)a + b;
        if (c < INT_MIN || c > INT_MAX)
            abort();
        return (int)c;
    }

A good compiler should convert the entire addition and if statement into an int-sized addition and a single conditional jump-on-overflow and never actually perform the larger addition.

As Stephen pointed out, I'm having trouble getting a (not-so-good) compiler, gcc, to generate the sane asm. The code it generates is not terribly slow, but certainly suboptimal. If anyone knows variants on this code that will get gcc to do the right thing, I'd love to see them.
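One variant that may help here, if portability to other compilers is not a concern, is the __builtin_add_overflow extension provided by GCC (version 5 and later) and Clang. It is not standard C. It performs the addition as if with infinite precision, stores the wrapped result, and returns true when the result did not fit, which typically lets the compiler emit an add followed by a jump on the overflow flag. The wrapper name add_with_builtin below is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>   /* NULL */

/* Hypothetical wrapper around a GCC/Clang extension (not standard C).
   Returns true when lhs + rhs fits in an int and stores the sum in *result.
   On overflow it returns false; *result then holds the wrapped value. */
bool add_with_builtin(int lhs, int rhs, int *result)
{
    assert(result != NULL);
    return !__builtin_add_overflow(lhs, rhs, result);
}
```

On compilers without this builtin, the subtraction-based check or the wider-type approach above remain the portable options.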
