获取 sqrt(n) 整数部分的最快方法?
我们知道如果 n
不是一个完美的正方形,那么 sqrt(n)
就不是一个整数.由于我只需要整数部分,我觉得调用 sqrt(n)
不会那么快,因为计算小数部分也需要时间.
As we know if n
is not a perfect square, then sqrt(n)
would not be an integer. Since I need only the integer part, I feel that calling sqrt(n)
wouldn't be that fast, as it takes time to calculate the fractional part also.
所以我的问题是,
我们可以只得到sqrt(n)的整数部分而不计算sqrt(n)
的实际值吗?算法应该比sqrt(n)
(在
或
中定义)更快?
Can we get only the integer part of sqrt(n) without calculating the actual value of sqrt(n)
? The algorithm should be faster than sqrt(n)
(defined in <math.h>
or <cmath>
)?
如果可能,您也可以在 asm
块中编写代码.
If possible, you can write the code in asm
block also.
推荐答案
我会尝试快速逆平方根技巧.
这是一种在没有任何分支的情况下获得 1/sqrt(n)
的非常好的近似值的方法,基于一些位操作,因此不可移植(特别是在 32 位和 64 位平台之间)).
It's a way to get a very good approximation of 1/sqrt(n)
without any branch, based on some bit-twiddling so not portable (notably between 32-bits and 64-bits platforms).
一旦你得到它,你只需要把结果取反,取整数部分.
Once you get it, you just need to inverse the result, and takes the integer part.
当然,可能有更快的技巧,因为这个技巧有点绕.
There might be faster tricks, of course, since this one is a bit of a round about.
编辑:让我们做吧!
先来个小帮手:
// benchmark.h
#include <sys/time.h>
template <typename Func>
double benchmark(Func f, size_t iterations)
{
f();
timeval a, b;
gettimeofday(&a, 0);
for (; iterations --> 0;)
{
f();
}
gettimeofday(&b, 0);
return (b.tv_sec * (unsigned int)1e6 + b.tv_usec) -
(a.tv_sec * (unsigned int)1e6 + a.tv_usec);
}
然后是主体:
#include <iostream>
#include <cmath>
#include "benchmark.h"
class Sqrt
{
public:
Sqrt(int n): _number(n) {}
int operator()() const
{
double d = _number;
return static_cast<int>(std::sqrt(d) + 0.5);
}
private:
int _number;
};
// http://www.codecodex.com/wiki/Calculate_an_integer_square_root
class IntSqrt
{
public:
IntSqrt(int n): _number(n) {}
int operator()() const
{
int remainder = _number;
if (remainder < 0) { return 0; }
int place = 1 <<(sizeof(int)*8 -2);
while (place > remainder) { place /= 4; }
int root = 0;
while (place)
{
if (remainder >= root + place)
{
remainder -= root + place;
root += place*2;
}
root /= 2;
place /= 4;
}
return root;
}
private:
int _number;
};
// http://en.wikipedia.org/wiki/Fast_inverse_square_root
class FastSqrt
{
public:
FastSqrt(int n): _number(n) {}
int operator()() const
{
float number = _number;
float x2 = number * 0.5F;
float y = number;
long i = *(long*)&y;
//i = (long)0x5fe6ec85e7de30da - (i >> 1);
i = 0x5f3759df - (i >> 1);
y = *(float*)&i;
y = y * (1.5F - (x2*y*y));
y = y * (1.5F - (x2*y*y)); // let's be precise
return static_cast<int>(1/y + 0.5f);
}
private:
int _number;
};
int main(int argc, char* argv[])
{
if (argc != 3) {
std::cerr << "Usage: %prog integer iterations
";
return 1;
}
int n = atoi(argv[1]);
int it = atoi(argv[2]);
assert(Sqrt(n)() == IntSqrt(n)() &&
Sqrt(n)() == FastSqrt(n)() && "Different Roots!");
std::cout << "sqrt(" << n << ") = " << Sqrt(n)() << "
";
double time = benchmark(Sqrt(n), it);
double intTime = benchmark(IntSqrt(n), it);
double fastTime = benchmark(FastSqrt(n), it);
std::cout << "Number iterations: " << it << "
"
"Sqrt computation : " << time << "
"
"Int computation : " << intTime << "
"
"Fast computation : " << fastTime << "
";
return 0;
}
结果:
sqrt(82) = 9
Number iterations: 4096
Sqrt computation : 56
Int computation : 217
Fast computation : 119
// Note had to tweak the program here as Int here returns -1 :/
sqrt(2147483647) = 46341 // real answer sqrt(2 147 483 647) = 46 340.95
Number iterations: 4096
Sqrt computation : 57
Int computation : 313
Fast computation : 119
正如预期的那样,Fast 计算的性能比 Int 计算要好得多.
Where as expected the Fast computation performs much better than the Int computation.
哦,顺便说一句,sqrt
更快:)
Oh, and by the way, sqrt
is faster :)
相关文章