How can I convert an arbitrary double to an integer while avoiding undefined behavior?
Let's say I've got a function that accepts a 64-bit integer, and I want to call it with a double of arbitrary numeric value (i.e. it may be very large in magnitude, or even infinite):
void DoSomething(int64_t x);
double d = [...];
DoSomething(d);
Paragraph 1 of [conv.fpint] in the C++11 standard says this:
A prvalue of a floating point type can be converted to a prvalue of an integer type. The conversion truncates; that is, the fractional part is discarded. The behavior is undefined if the truncated value cannot be represented in the destination type.
Therefore there are many values of d above that will cause undefined behavior. I would like the conversion to saturate, so that values greater than std::numeric_limits<int64_t>::max() (called kint64max below), including infinity, become that value, and similarly for the minimum representable value (kint64min). This seems the natural approach:
double clamped = std::min(d, static_cast<double>(kint64max));
clamped = std::max(clamped, static_cast<double>(kint64min));
DoSomething(clamped);
But the next paragraph in the standard says this:
A prvalue of an integer type or of an unscoped enumeration type can be converted to a prvalue of a floating point type. The result is exact if possible. If the value being converted is in the range of values that can be represented but the value cannot be represented exactly, it is an implementation-defined choice of either the next lower or higher representable value.
So static_cast<double>(kint64max) may round up, clamped may still wind up being kint64max + 1, and the behavior may still be undefined.
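To make the rounding concern concrete, here is a minimal sketch of what the conversion does on a typical platform. It assumes an IEEE-754 binary64 double (53 significant bits) and round-to-nearest; the standard only says the choice between the two neighbouring values is implementation-defined, so other results are permitted. The helper names RoundedInt64Max and ClampBoundOvershoots are hypothetical and exist only for illustration.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: the double obtained by converting kint64max.
// 2^63 - 1 needs 63 significant bits, but a binary64 double has only 53,
// so the value cannot be stored exactly and must be rounded to a neighbour.
double RoundedInt64Max() {
  constexpr int64_t kint64max = std::numeric_limits<int64_t>::max();
  return static_cast<double>(kint64max);
}

// Hypothetical helper: true when the rounded bound is 2^63, i.e. one more
// than kint64max. The literal 9223372036854775808.0 is exactly 2^63.
// On typical IEEE-754 platforms with round-to-nearest this returns true,
// but the result is implementation-defined.
bool ClampBoundOvershoots() {
  return RoundedInt64Max() == 9223372036854775808.0;
}
```

When ClampBoundOvershoots() is true, std::min(d, static_cast<double>(kint64max)) can produce 2^63, which is out of range for int64_t, so the subsequent conversion is still undefined.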
What is the simplest portable way to do what I'm looking for? Bonus points if it also gracefully handles NaNs.
Update: To be more precise, I would like all of the following to be true of an int64_t SafeCast(double) function that solves this problem:
For any double d, calling SafeCast(d) does not perform undefined behavior according to the standard, nor does it throw an exception or otherwise abort.
For any double d in the range [-2^63, 2^63), SafeCast(d) == static_cast<int64_t>(d). That is, SafeCast agrees with C++'s conversion rules wherever the latter is defined.
For any double d >= 2^63, SafeCast(d) == kint64max.
For any double d < -2^63, SafeCast(d) == kint64min.
I suspect the true difficulty here is in figuring out whether d is in the range [-2^63, 2^63). As discussed in the question and in comments to other answers, I think using a cast of kint64max to double to test for the upper bound is a non-starter, because the cast may round up to 2^63 and the later conversion would then be undefined. It may be more promising to use std::pow(2, 63), but I don't know whether this is guaranteed to be exactly 2^63.
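One way to sidestep the question about std::pow is to build the bound with std::ldexp, which scales by a power of two. The sketch below assumes a binary double (radix 2) whose exponent range covers 2^63; under that assumption 2^63 is representable, and on IEEE-754 implementations std::ldexp(1.0, 63) yields it exactly. The helper name InInt64Range is hypothetical, and this is an illustration of the range test, not the method used in the answer.

```cpp
#include <cmath>
#include <limits>

// Assumptions of this sketch, checked at compile time.
static_assert(std::numeric_limits<double>::radix == 2,
              "sketch assumes a binary floating-point double");
static_assert(std::numeric_limits<double>::max_exponent > 64,
              "sketch assumes 2^63 is within the exponent range");

// Hypothetical helper: true if d lies in [-2^63, 2^63), i.e. the truncated
// value of d is representable as a 64-bit signed integer.
bool InInt64Range(double d) {
  // 1.0 * 2^63; only the exponent changes, so no rounding is involved
  // on a binary format.
  const double two_to_63 = std::ldexp(1.0, 63);
  // Comparisons with NaN are false, so NaN is reported as out of range.
  return d >= -two_to_63 && d < two_to_63;
}
```

Note that -2^63 is itself a double, and no double lies strictly between -2^63 - 1 and -2^63, so the closed lower bound matches the set of values whose truncation fits.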
Recommended answer
It turns out this is simpler to do than I thought. Thanks to Michael O'Reilly for the basic idea of this solution.
The heart of the matter is figuring out whether the truncated double will be representable as an int64_t. You can do this easily using std::frexp:
#include <cmath>
#include <cstdint>
#include <limits>

static constexpr int64_t kint64min = std::numeric_limits<int64_t>::min();
static constexpr int64_t kint64max = std::numeric_limits<int64_t>::max();

int64_t SafeCast(double d) {
  // We must special-case NaN, for which the logic below doesn't work.
  if (std::isnan(d)) {
    return 0;
  }

  // Find that exponent exp such that
  //     d == x * 2^exp
  // for some x with abs(x) in [0.5, 1.0). Note that this implies that the
  // magnitude of d is strictly less than 2^exp.
  //
  // If d is infinite, the call to std::frexp is legal but the contents of exp
  // are unspecified.
  int exp;
  std::frexp(d, &exp);

  // If the magnitude of d is strictly less than 2^63, the truncated version
  // of d is guaranteed to be representable. The only representable integer
  // for which this is not the case is kint64min, but it is covered by the
  // logic below.
  if (std::isfinite(d) && exp <= 63) {
    return static_cast<int64_t>(d);
  }

  // Handle infinities and finite numbers with magnitude >= 2^63.
  return std::signbit(d) ? kint64min : kint64max;
}
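To see why the test exp <= 63 picks out exactly the doubles with magnitude below 2^63, it can help to look at the exponent std::frexp reports near the boundary. The following is a small sketch; BinaryExponent is a hypothetical helper introduced only for this illustration, and it assumes a finite argument and a binary double.

```cpp
#include <cmath>

// Hypothetical helper: returns the exponent e that std::frexp reports for d,
// so that abs(d) lies in [2^(e-1), 2^e) for finite nonzero d.
// For d == 0 std::frexp stores 0. The argument is assumed to be finite,
// since the stored exponent is unspecified for infinities and NaN.
int BinaryExponent(double d) {
  int exp = 0;
  std::frexp(d, &exp);  // the returned fraction is not needed here
  return exp;
}
```

With this helper, 1.0 reports exponent 1 (1.0 == 0.5 * 2^1), the largest double below 2^63 reports 63, and both 2^63 and -2^63 report 64. So exp <= 63 admits every finite value with magnitude below 2^63, while -2^63 falls through to the std::signbit branch, which returns kint64min, the same value a direct conversion would give.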