double or float, which is faster?

2022-01-09 00:00:00 floating-point double c++

I am reading "Accelerated C++". I found a sentence that states "sometimes double is faster in execution than float in C++". After reading it, I got confused about how float and double work. Please explain this point to me.

Recommended answer

Depends on what the native hardware does.

  • If the hardware is (or is like) x86 with legacy x87 math, float and double are both extended (for free) to an internal 80-bit format, so both have the same performance (except for cache footprint and memory bandwidth).

  • If the hardware implements both natively, like most modern ISAs (including x86-64, where SSE2 is the default for scalar FP math), then most FPU operations usually run at the same speed for both. Double division and sqrt can be slower than their float counterparts, and both are of course significantly slower than multiply or add. (Because float is smaller, it can mean fewer cache misses, and with SIMD, loops that vectorize process twice as many elements per vector.)

  • If the hardware implements only double, then float will be slower if conversion to and from the native double format isn't free as part of the float-load and float-store instructions.

  • If the hardware implements only float, then emulating double with it will cost even more time. In this case, float will be faster.

  • If the hardware implements neither, both have to be implemented in software. In this case, both will be slow, but double will be slightly slower (more load and store operations, at the least).

The quote you mention is probably referring to the x86 platform, where the first case applied. But this doesn't hold true in general.

Also beware that x * 3.3 + y for float x, y will trigger promotion to double for both variables, because 3.3 is a double literal. This is not the hardware's fault, and you should avoid it by writing 3.3f, which lets your compiler generate efficient asm that actually keeps the numbers as floats, if that's what you want.
