Different floating point result with optimization enabled - compiler bug?
The code below works in Visual Studio 2008 both with and without optimization, but with g++ it only works without optimization (-O0).
#include <cstdlib>
#include <iostream>
#include <cmath>
double round(double v, double digit)
{
    double pow = std::pow(10.0, digit);
    double t = v * pow;
    //std::cout << "t:" << t << std::endl;
    double r = std::floor(t + 0.5);
    //std::cout << "r:" << r << std::endl;
    return r / pow;
}

int main(int argc, char *argv[])
{
    std::cout << round(4.45, 1) << std::endl;
    std::cout << round(4.55, 1) << std::endl;
}
The output should be:
4.5
4.6
But g++ with optimization (-O1 through -O3) outputs:
4.5
4.5
If I add the volatile keyword before t, it works, so might there be some kind of optimization bug?
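The volatile workaround mentioned above can be sketched as follows. This is only an illustration of the change described in the question: the function name round_volatile and the variable name scale are made up here, and the rest follows the posted round().

```cpp
#include <cmath>

// Hypothetical variant of the question's round(); only the volatile
// qualifier on t differs from the posted code.
double round_volatile(double v, double digit)
{
    double scale = std::pow(10.0, digit);
    // volatile forces t to be written to memory, which rounds the
    // product to a 64-bit double before it is used again.
    volatile double t = v * scale;
    double r = std::floor(t + 0.5);
    return r / scale;
}
```

Calling round_volatile(4.45, 1) and round_volatile(4.55, 1) should then print 4.5 and 4.6, as the question reports for the volatile version.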
Tested on g++ 4.1.2 and 4.4.4.
Here is the result on ideone: http://ideone.com/Rz937
The option I used with g++ is simple:
g++ -O2 round.cpp
A more interesting result: even when I turn on the /fp:fast option in Visual Studio 2008, the result is still correct.
Further question:
I was wondering, should I always turn on the -ffloat-store option?
The g++ versions I tested are the ones shipped with CentOS/Red Hat Linux 5 and CentOS/Red Hat 6.
I have compiled many of my programs on these platforms, and I am worried that this will cause unexpected bugs in them. It seems difficult to check all my C++ code and the libraries I use for such problems. Any suggestions?
Is anyone interested in why Visual Studio 2008 still works even with /fp:fast turned on? It seems Visual Studio 2008 is more reliable than g++ on this problem.
Recommended answer
The x87 floating-point unit on Intel x86 processors uses 80-bit extended precision internally, whereas double is normally 64 bits wide. Different optimization levels affect how often floating-point values get stored from the CPU into memory and thus rounded from 80-bit precision to 64-bit precision.
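To make the effect of that rounding concrete, here is a small sketch that computes the question's intermediate product t = v * 10 in two ways. It assumes long double is the 80-bit x87 format (as is usual for gcc on x86) and uses it as a stand-in for a value that stays in a register; the function names are hypothetical.

```cpp
#include <cmath>

// Product rounded to a 64-bit double, as happens when t is stored to memory.
double scaled_as_double(double v)
{
    volatile double t = v * 10.0;  // volatile forces the store
    return t;
}

// Product kept in long double, standing in for a value that stays in an
// x87 register (assumption: long double is the 80-bit extended format).
long double scaled_as_extended(double v)
{
    return static_cast<long double>(v) * 10.0L;
}
```

The double closest to 4.55 is slightly below 4.55. Rounded to 64 bits, its product with 10 becomes exactly 45.5, so floor(t + 0.5) yields 46. Kept at 80 bits, the product stays slightly below 45.5, so floor(t + 0.5) yields 45. That accounts for the 4.6 versus 4.5 difference in the question.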
Use the gcc option -ffloat-store to get the same floating-point results at different optimization levels.
Alternatively, use the long double type, which is normally 80 bits wide on gcc, to avoid rounding from 80-bit to 64-bit precision.
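A long double version of the question's function might look like the sketch below. The name round_ld and the choice to use long double for every intermediate are illustrative assumptions, not something the answer spells out.

```cpp
#include <cmath>

// Hypothetical long double variant of the question's round(); every
// intermediate is a long double, so nothing is rounded to 64 bits.
long double round_ld(long double v, long double digit)
{
    long double scale = std::pow(10.0L, digit);
    long double t = v * scale;
    long double r = std::floor(t + 0.5L);
    return r / scale;
}
```

Note that this makes the result consistent rather than "expected": where long double is 80 bits wide, round_ld(4.55, 1) called with the double literal 4.55 should give 4.5 at every optimization level, because the stored value is slightly below 4.55 and the product is never rounded up to 45.5.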
man gcc says it all:
-ffloat-store
    Do not store floating point variables in registers, and inhibit
    other options that might change whether a floating point value is
    taken from a register or memory.

    This option prevents undesirable excess precision on machines such
    as the 68000 where the floating registers (of the 68881) keep more
    precision than a "double" is supposed to have. Similarly for the
    x86 architecture. For most programs, the excess precision does
    only good, but a few programs rely on the precise definition of
    IEEE floating point. Use -ffloat-store for such programs, after
    modifying them to store all pertinent intermediate computations
    into variables.
In x86_64 builds, compilers use SSE registers for float and double by default, so no extended precision is used and this issue doesn't occur.
The gcc compiler option -mfpmath controls that.
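One way to check which evaluation mode a given build ends up with is the standard FLT_EVAL_METHOD macro from <cfloat>. The helper below is a hypothetical sketch; the compile commands in its comments are examples of how -mfpmath might be combined with other flags, and the values they produce are what I would typically expect rather than a guarantee.

```cpp
#include <cfloat>
#include <string>

// Hypothetical helper: describes how floating-point expressions are
// evaluated in this build, based on FLT_EVAL_METHOD.
//
// Example builds (expected values are typical, not guaranteed):
//   g++ -m32 -mfpmath=387 check.cpp         -> usually 2 (x87, extended)
//   g++ -m32 -msse2 -mfpmath=sse check.cpp  -> usually 0 (SSE)
//   g++ check.cpp on x86_64                 -> usually 0 (SSE by default)
std::string eval_method()
{
#ifdef FLT_EVAL_METHOD
    switch (FLT_EVAL_METHOD) {
    case 0:  return "operations use each type's own precision";
    case 1:  return "float and double are evaluated as double";
    case 2:  return "all operations are evaluated as long double";
    default: return "indeterminate or implementation-defined";
    }
#else
    return "FLT_EVAL_METHOD is not defined";
#endif
}
```

A build that reports the long double case is one where the excess-precision behavior from the question can appear.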