GCC problem with raw double type comparisons

2022-01-25 00:00:00 gcc compare double c++

I have the following bit of code; however, when compiling it with GCC 4.4 with various optimization flags, I get some unexpected results when it's run.

#include <iostream>

int main()
{
   const unsigned int cnt = 10;
   double lst[cnt] = { 0.0 };
   const double v[4] = { 131.313, 737.373, 979.797, 731.137 };

   for(unsigned int i = 0; i < cnt; ++i) {
      lst[i] = v[i % 4] * i;
   }

   for(unsigned int i = 0; i < cnt; ++i) {
      double d = v[i % 4] * i;
      if(lst[i] != d) {
         std::cout << "error @ : " << i << std::endl;
         return 1;
      }
   }
   return 0;
}

  • When compiled with "g++ -pedantic -Wall -Werror -O1 -o test test.cpp", I get the following output: "error @ : 3"
  • When compiled with "g++ -pedantic -Wall -Werror -O2 -o test test.cpp", I get the following output: "error @ : 3"
  • When compiled with "g++ -pedantic -Wall -Werror -O3 -o test test.cpp", I get no errors.
  • When compiled with "g++ -pedantic -Wall -Werror -o test test.cpp", I get no errors.

I do not believe this to be an issue related to rounding or to an epsilon difference in the comparison. I've tried this with Intel v10 and MSVC 9.0, and both seem to work as expected. I believe this should be nothing more than a bitwise compare.
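For reference, `!=` on two doubles is a floating-point comparison, not a bitwise one, and on x87 one operand may still sit in an 80-bit register when it is evaluated. A minimal sketch of what an explicit bitwise comparison could look like is shown here, assuming a C++11 compiler and a 64-bit IEEE 754 `double`; `bits_of` and `same_bits` are hypothetical helper names, not part of the original code.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns the raw 64-bit pattern of a double.
// Copying through memory means only the 64 stored bits are inspected.
std::uint64_t bits_of(double x)
{
   static_assert(sizeof(double) == sizeof(std::uint64_t),
                 "this sketch assumes a 64-bit double");
   std::uint64_t u;
   std::memcpy(&u, &x, sizeof u);
   return u;
}

// Hypothetical helper: true when both values have identical bit patterns.
bool same_bits(double a, double b)
{
   return bits_of(a) == bits_of(b);
}
```

Note that a bitwise comparison is stricter than `==` in one respect: `0.0` and `-0.0` compare equal with `==` but have different bit patterns.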

If I replace the if-statement with the following: if (static_cast<long long int>(lst[i]) != static_cast<long long int>(d)), and add "-Wno-long-long", I get no errors in any of the optimization modes when run.
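The cast probably hides the mismatch rather than fixing it: converting to an integer discards the fractional part, so two values that differ only in their last bits become equal. A small sketch of that effect follows, assuming C++11 for `std::nextafter`; `equal_after_truncation` and `next_up` are hypothetical names used only for illustration.

```cpp
#include <cmath>

// Hypothetical helper mirroring the cast used in the modified if-statement.
bool equal_after_truncation(double a, double b)
{
   return static_cast<long long int>(a) == static_cast<long long int>(b);
}

// Hypothetical helper: the next representable double above x.
double next_up(double x)
{
   return std::nextafter(x, HUGE_VAL);
}
```

For example, with `a = 2193.411` (roughly the value at index 3) and `b = next_up(a)`, `a != b` holds while `equal_after_truncation(a, b)` is also true, because both truncate to 2193.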

If I add std::cout << d << std::endl; before the "return 1", I get no errors in any of the optimization modes when run.

Is this a bug in my code, or is there something wrong with GCC and the way it handles the double type?

Note: I've just tried this with GCC versions 4.3 and 3.3, and the error is not exhibited.

Resolution: Mike Dinsdale noted the following bug report: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=323 It seems the GCC team is not completely sure about the nature of the problem.

As suggested in the bug report, a possible resolution is to use the -ffloat-store option. I've tried this and it works; however, the results from a performance point of view are not that great, though YMMV.
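Because -ffloat-store applies to the whole translation unit, a narrower workaround may be to force only the compared values through memory. The sketch below does that with a volatile store; `force_double` and `first_mismatch` are hypothetical names, and the loop bodies are adapted from the code in the question rather than taken from the bug report.

```cpp
// Hypothetical helper: rounds a value to a true 64-bit double by forcing it
// through memory. The volatile store keeps the compiler from leaving the
// value in a wider x87 register.
double force_double(double x)
{
   volatile double stored = x;
   return stored;
}

// Adapted from the question's program: returns the index of the first
// mismatch, or -1 if every element matches.
int first_mismatch()
{
   const unsigned int cnt = 10;
   double lst[cnt] = { 0.0 };
   const double v[4] = { 131.313, 737.373, 979.797, 731.137 };

   for (unsigned int i = 0; i < cnt; ++i) {
      lst[i] = force_double(v[i % 4] * i);
   }

   for (unsigned int i = 0; i < cnt; ++i) {
      const double d = force_double(v[i % 4] * i);
      if (lst[i] != d) {
         return static_cast<int>(i);
      }
   }
   return -1;
}
```

Both sides of the comparison go through the same rounding step, so the comparison no longer depends on whether the optimizer kept one of them in a register. The cost is one extra store and load per value, paid only where the helper is called.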

Recommended answer

The fact that the result depends on the optimization settings suggests it might be the x87 extended precision messing with things (as Michael Burr says).
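One way to check whether a given build is likely to be affected is to look at `FLT_EVAL_METHOD` from `<cfloat>` (available since C++11), which reports how the compiler evaluates floating-point expressions. This is only a sketch; the two function names are hypothetical wrappers added for illustration.

```cpp
#include <cfloat>
#include <limits>

// Value of FLT_EVAL_METHOD for this translation unit:
//    0  operations are evaluated in the precision of their own type
//    1  float and double operations are evaluated as double
//    2  all operations are evaluated as long double (typical for x87 code)
//   -1  indeterminate
int evaluation_method()
{
   return FLT_EVAL_METHOD;
}

// Mantissa bits of long double: 64 for the x87 80-bit format,
// 53 on platforms where long double is the same as double.
int long_double_mantissa_bits()
{
   return std::numeric_limits<long double>::digits;
}
```

A value of 2 together with a 64-bit long double mantissa would be consistent with intermediate results being held in 80-bit x87 registers, which is the situation described above.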

Here's some code I use (with GCC on x86 processors) to switch the extended precision off:

      static const unsigned int PRECISION_BIT_MASK = 0x300;
      ///< bitmask to mask out all non-precision bits in the fpu control word \cite{INTEL}.
      static const unsigned int EXTENDED_PRECISION_BITS = 0x300;
      ///< add to the fpu control word (after zeroing precision bits) to turn on extended precision \cite{INTEL}.
      static const unsigned int STANDARD_PRECISION_BITS = 0x200;
      ///< add to the fpu control word (after zeroing precision bits) to turn off extended precision \cite{INTEL}.
      
      void set_fpu_control_word(unsigned int mode)
      {
        asm ("fldcw %0" : : "m" (*&mode));
      }
      
      unsigned int get_fpu_control_word()
      {
        volatile unsigned int mode = 0;
        asm ("fstcw %0" : "=m" (*&mode));
        return mode;
      }
      
      bool fpu_set_extended_precision_is_on(bool state)
      {
        unsigned int old_cw = get_fpu_control_word();
        unsigned int masked = old_cw & ~PRECISION_BIT_MASK;
        unsigned int new_cw;
        if(state)
          new_cw = masked + EXTENDED_PRECISION_BITS;
        else
          new_cw = masked + STANDARD_PRECISION_BITS;
        set_fpu_control_word(new_cw);
        return true;
      }
      
      bool fpu_get_extended_precision_is_on()
      {
        unsigned int old_cw = get_fpu_control_word();
        return  ((old_cw & PRECISION_BIT_MASK) == 0x300);
      }
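If the reduced precision is only wanted for a limited region of code, the same control-word manipulation can be wrapped in a scope guard that restores the previous setting on exit. The following is a sketch under assumptions, not part of the answer's code: `ScopedDoublePrecision` and `HAVE_X87_CONTROL_WORD` are hypothetical names, it uses a 16-bit variable because the control word is 16 bits wide, and on non-x86 or non-GCC-compatible compilers it compiles to a no-op.

```cpp
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define HAVE_X87_CONTROL_WORD 1
#else
#define HAVE_X87_CONTROL_WORD 0
#endif

// Hypothetical RAII helper: selects 53-bit (double) precision for the x87
// unit on construction and restores the previous control word on destruction.
class ScopedDoublePrecision
{
public:
   ScopedDoublePrecision() : saved_(read_cw())
   {
      // Clear the two precision-control bits (0x300), then select 0x200.
      write_cw(static_cast<unsigned short>((saved_ & ~0x300u) | 0x200u));
   }

   ~ScopedDoublePrecision() { write_cw(saved_); }

   // Reads the 16-bit control word; returns 0 where no x87 access is compiled in.
   static unsigned short read_cw()
   {
      unsigned short cw = 0;
#if HAVE_X87_CONTROL_WORD
      asm volatile ("fnstcw %0" : "=m" (cw));
#endif
      return cw;
   }

private:
   static void write_cw(unsigned short cw)
   {
#if HAVE_X87_CONTROL_WORD
      asm volatile ("fldcw %0" : : "m" (cw));
#else
      (void)cw;
#endif
   }

   unsigned short saved_;

   // Declared but not defined: the guard is not meant to be copied.
   ScopedDoublePrecision(const ScopedDoublePrecision&);
   ScopedDoublePrecision& operator=(const ScopedDoublePrecision&);
};
```

Declaring a `ScopedDoublePrecision` object at the top of a block would apply the setting for that block only. Note that the control word affects x87 instructions; code compiled to use SSE arithmetic for doubles is not changed by it.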
      

Or you can just run your code with valgrind, which doesn't simulate the 80-bit registers, and is probably easier for a short program like this!
