What is more efficient? Using pow to square or just multiplying it with itself?
Which of these two methods is more efficient in C? And how about:
pow(x,3)
vs.
x*x*x // etc?
Recommended answer
UPDATE 2021
I've modified the benchmark code as follows:
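- std::chrono is used for timing measurements instead of boost
- C++11 <random> is used instead of rand()
- Repeated operations that could get hoisted out of the loop are avoided. The base parameter changes on every iteration.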
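As a rough illustration of the std::chrono timing mentioned in the list above, here is a minimal sketch of a reusable timing helper. The name timeSeconds is hypothetical and does not appear in the benchmark, and it uses std::chrono::steady_clock, which is my substitution for the high_resolution_clock used in the benchmark code.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper (not part of the original benchmark): runs `work` once
// and returns the elapsed wall-clock time in seconds as a double.
template <typename Work>
double timeSeconds(Work&& work)
{
    // Assumption: steady_clock is used here because it is monotonic; the
    // benchmark itself uses high_resolution_clock.
    using Clock = std::chrono::steady_clock;
    const auto start = Clock::now();
    std::forward<Work>(work)();
    const std::chrono::duration<double> elapsed = Clock::now() - start;
    return elapsed.count();
}
```

It could be called as timeSeconds([&] { x += test3(b, loops); }), provided the result of the timed work is still used afterwards so the compiler cannot discard it.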
I get the following results with GCC 10 -O2 (in seconds):
exp   c++ pow     c pow      x*x*x...
2     0.204243    1.39962    0.0902527
3     1.36162     1.38291    0.107679
4     1.37717     1.38197    0.106103
5     1.3815      1.39139    0.117097
GCC 10 -O3 is almost identical to GCC 10 -O2.
With GCC 10 -O2 -ffast-math:
exp   c++ pow     c pow      x*x*x...
2     0.203625    1.4056     0.0913414
3     0.11094     1.39938    0.108027
4     0.201593    1.38618    0.101585
5     0.102141    1.38212    0.10662
With GCC 10 -O3 -ffast-math:
exp   c++ pow     c pow      x*x*x...
2     0.0451995   1.175      0.0450497
3     0.0470842   1.20226    0.051399
4     0.0475239   1.18033    0.0473844
5     0.0522424   1.16817    0.0522291
With Clang 12 -O2:
exp   c++ pow     c pow      x*x*x...
2     0.106242    0.105435   0.105533
3     1.45909     1.4425     0.102235
4     1.45629     1.44262    0.108861
5     1.45837     1.44483    0.1116
Clang 12 -O3 is almost identical to Clang 12 -O2.
With Clang 12 -O2 -ffast-math:
exp   c++ pow     c pow      x*x*x...
2     0.0233731   0.0232457  0.0231076
3     0.0271074   0.0266663  0.0278415
4     0.026897    0.0270698  0.0268115
5     0.0312481   0.0296402  0.029811
Clang 12 -O3 -ffast-math is almost identical to Clang 12 -O2 -ffast-math.
Machine is an Intel Core i7-7700K on Linux 5.4.0-73-generic x86_64.
Conclusions:
- With GCC 10 (no -ffast-math), x*x*x... is always faster
- With GCC 10 -O2 -ffast-math, std::pow is as fast as x*x*x... for odd exponents
- With GCC 10 -O3 -ffast-math, std::pow is as fast as x*x*x... for all test cases, and is around twice as fast as -O2
- With GCC 10, C's pow(double, double) is always much slower
- With Clang 12 (no -ffast-math), x*x*x... is faster for exponents greater than 2
- With Clang 12 -ffast-math, all methods produce similar results
- With Clang 12, pow(double, double) is as fast as std::pow for integral exponents
- Writing benchmarks without having the compiler outsmart you is hard.
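If the exponent is a small non-negative integer but not a literal in the source, the x*x*x... form can be wrapped in a loop. The following is only a sketch: powByMultiplying is a hypothetical name that does not appear in the benchmark above, and whether it matches the hand-written product in speed depends on how the compiler unrolls the loop, so it would need to be measured.

```cpp
// Hypothetical helper (not part of the benchmark above): computes x^n for a
// non-negative integer n with n plain multiplications, i.e. x*x*...*x.
inline double powByMultiplying(double x, unsigned n)
{
    double result = 1.0;
    for (unsigned i = 0; i < n; ++i)
    {
        result *= x;  // one multiplication per unit of the exponent
    }
    return result;
}
```

For example, powByMultiplying(b, 3) performs the same multiplications as b*b*b, preceded by an exact multiplication by 1.0.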
I'll eventually get around to installing a more recent version of GCC on my machine and will update my results when I do so.
Here's the updated benchmark code:
#include <cmath>
#include <chrono>
#include <iostream>
#include <random>
using Moment = std::chrono::high_resolution_clock::time_point;
using FloatSecs = std::chrono::duration<double>;
inline Moment now()
{
    return std::chrono::high_resolution_clock::now();
}

// Defines test<num>(b, loops): accumulates `expression` while changing b on
// every iteration, so the work cannot be hoisted out of the loop.
#define TEST(num, expression) \
double test##num(double b, long loops) \
{ \
    double x = 0.0; \
    auto startTime = now(); \
    for (long i=0; i<loops; ++i) \
    { \
        x += expression; \
        b += 1.0; \
    } \
    auto elapsed = now() - startTime; \
    auto seconds = std::chrono::duration_cast<FloatSecs>(elapsed); \
    std::cout << seconds.count() << " "; \
    return x; \
}

TEST(2, b*b)
TEST(3, b*b*b)
TEST(4, b*b*b*b)
TEST(5, b*b*b*b*b)

// Same loop, using the C++ overload set of std::pow with an int exponent.
template <int exponent>
double testCppPow(double base, long loops)
{
    double x = 0.0;
    auto startTime = now();
    for (long i=0; i<loops; ++i)
    {
        x += std::pow(base, exponent);
        base += 1.0;
    }
    auto elapsed = now() - startTime;
    auto seconds = std::chrono::duration_cast<FloatSecs>(elapsed);
    std::cout << seconds.count() << " ";
    return x;
}

// Same loop, using C's pow(double, double).
double testCPow(double base, double exponent, long loops)
{
    double x = 0.0;
    auto startTime = now();
    for (long i=0; i<loops; ++i)
    {
        x += ::pow(base, exponent);
        base += 1.0;
    }
    auto elapsed = now() - startTime;
    auto seconds = std::chrono::duration_cast<FloatSecs>(elapsed);
    std::cout << seconds.count() << " ";
    return x;
}

int main()
{
    using std::cout;
    long loops = 100000000l;
    double x = 0;
    std::random_device rd;
    std::default_random_engine re(rd());
    std::uniform_real_distribution<double> dist(1.1, 1.2);
    cout << "exp c++ pow c pow x*x*x...";

    cout << "\n2 ";
    double b = dist(re);
    x += testCppPow<2>(b, loops);
    x += testCPow(b, 2.0, loops);
    x += test2(b, loops);

    cout << "\n3 ";
    b = dist(re);
    x += testCppPow<3>(b, loops);
    x += testCPow(b, 3.0, loops);
    x += test3(b, loops);

    cout << "\n4 ";
    b = dist(re);
    x += testCppPow<4>(b, loops);
    x += testCPow(b, 4.0, loops);
    x += test4(b, loops);

    cout << "\n5 ";
    b = dist(re);
    x += testCppPow<5>(b, loops);
    x += testCPow(b, 5.0, loops);
    x += test5(b, loops);

    // Print the accumulated value so the computations cannot be discarded.
    std::cout << "\n" << x << "\n";
}

OLD ANSWER, 2010
I tested the performance difference between x*x*... vs pow(x,i) for small i using this code:
#include <cstdlib>
#include <cmath>
#include <iostream>
#include <boost/date_time/posix_time/posix_time.hpp>

inline boost::posix_time::ptime now()
{
    return boost::posix_time::microsec_clock::local_time();
}

// Defines test<num>(b, loops): evaluates `expression` ten times per iteration
// and accumulates the results.
#define TEST(num, expression) \
double test##num(double b, long loops) \
{ \
    double x = 0.0; \
    boost::posix_time::ptime startTime = now(); \
    for (long i=0; i<loops; ++i) \
    { \
        x += expression; \
        x += expression; \
        x += expression; \
        x += expression; \
        x += expression; \
        x += expression; \
        x += expression; \
        x += expression; \
        x += expression; \
        x += expression; \
    } \
    boost::posix_time::time_duration elapsed = now() - startTime; \
    std::cout << elapsed << " "; \
    return x; \
}

TEST(1, b)
TEST(2, b*b)
TEST(3, b*b*b)
TEST(4, b*b*b*b)
TEST(5, b*b*b*b*b)

template <int exponent>
double testpow(double base, long loops)
{
    double x = 0.0;
    boost::posix_time::ptime startTime = now();
    for (long i=0; i<loops; ++i)
    {
        x += std::pow(base, exponent);
        x += std::pow(base, exponent);
        x += std::pow(base, exponent);
        x += std::pow(base, exponent);
        x += std::pow(base, exponent);
        x += std::pow(base, exponent);
        x += std::pow(base, exponent);
        x += std::pow(base, exponent);
        x += std::pow(base, exponent);
        x += std::pow(base, exponent);
    }
    boost::posix_time::time_duration elapsed = now() - startTime;
    std::cout << elapsed << " ";
    return x;
}

int main()
{
    using std::cout;
    long loops = 100000000l;
    double x = 0.0;

    cout << "1 ";
    x += testpow<1>(rand(), loops);
    x += test1(rand(), loops);

    cout << "\n2 ";
    x += testpow<2>(rand(), loops);
    x += test2(rand(), loops);

    cout << "\n3 ";
    x += testpow<3>(rand(), loops);
    x += test3(rand(), loops);

    cout << "\n4 ";
    x += testpow<4>(rand(), loops);
    x += test4(rand(), loops);

    cout << "\n5 ";
    x += testpow<5>(rand(), loops);
    x += test5(rand(), loops);

    cout << "\n" << x << "\n";
}

Results are:
1 00:00:01.126008 00:00:01.128338
2 00:00:01.125832 00:00:01.127227
3 00:00:01.125563 00:00:01.126590
4 00:00:01.126289 00:00:01.126086
5 00:00:01.126570 00:00:01.125930
2.45829e+54
Note that I accumulate the result of every pow calculation to make sure the compiler doesn't optimize it away.
If I use the std::pow(double, double) version, and loops = 1000000l, I get:
1 00:00:00.011339 00:00:00.011262
2 00:00:00.011259 00:00:00.011254
3 00:00:00.975658 00:00:00.011254
4 00:00:00.976427 00:00:00.011254
5 00:00:00.973029 00:00:00.011254
2.45829e+52
This is on an Intel Core Duo running Ubuntu 9.10 64bit. Compiled using gcc 4.4.1 with -O2 optimization.
So in C, yes, x*x*x will be faster than pow(x, 3), because there is no pow(double, int) overload. In C++, it will be roughly the same. (Assuming the methodology in my testing is correct.)
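Before swapping one form for the other, it may be worth checking that both give results that are close enough for your purposes, since the two computations round differently and are not guaranteed to be bit-identical. This is probably also why the compilers above only treat them interchangeably for exponents above 2 under -ffast-math, although that is my reading of the numbers rather than something the benchmark proves. The helper below is a sketch; relativeDifference is a hypothetical name.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical helper (not from the answer above): relative difference
// between two doubles, defined as 0 when both values are 0.
inline double relativeDifference(double a, double b)
{
    const double scale = std::max(std::fabs(a), std::fabs(b));
    return scale == 0.0 ? 0.0 : std::fabs(a - b) / scale;
}
```

A call such as relativeDifference(x*x*x, std::pow(x, 3)) can then be compared against whatever tolerance the application accepts.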
This is in response to the comment made by An Markm:
Even if a using namespace std directive was issued, if the second parameter to pow is an int, then the std::pow(double, int) overload from <cmath> will be called instead of ::pow(double, double) from <math.h>.
This test code confirms that behavior:
#include <iostream>

namespace foo
{
    double bar(double x, int i)
    {
        std::cout << "foo::bar\n";
        return x*i;
    }
}

double bar(double x, double y)
{
    std::cout << "::bar\n";
    return x*y;
}

using namespace foo;

int main()
{
    double a = bar(1.2, 3);  // Prints "foo::bar"
    std::cout << a << "\n";
    return 0;
}