std::bind vs. lambda performance
I wanted to time a few functions' execution and I've written myself a helper:
#include <chrono>
#include <iostream>
#include <string>
using namespace std;

template<int N = 1, class Fun, class... Args>
void timeExec(string name, Fun fun, Args... args) {
    auto start = chrono::steady_clock::now();
    for(int i = 0; i < N; ++i) {
        fun(args...);
    }
    auto end = chrono::steady_clock::now();
    auto diff = end - start;
    cout << name << ": " << chrono::duration<double, milli>(diff).count() << " ms." << endl;
}
I figured that for timing member functions this way I'd have to use bind or lambda and I wanted to see which would impact the performance less, so I did:
const int TIMES = 10000;
timeExec<TIMES>("Bind evaluation", bind(&decltype(result)::eval, &result));
timeExec<1>("Lambda evaluation", [&]() {
    for(int i = 0; i < TIMES; ++i) {
        result.eval();
    }
});
The results:
Bind evaluation: 0.355158 ms.
Lambda evaluation: 0.014414 ms.
I don't know the internals, but I assume that lambda cannot be that much better than bind. The only plausible explanation I can think of is the compiler optimizing out subsequent function evaluations in the lambda's loop.
How would you explain it?
Recommended answer
I assume that lambda cannot be that better than bind.
That's a preconception.
Lambdas are tied into the compiler internals, so extra optimization opportunities may be found. Moreover, they're designed to avoid inefficiency.
However, there are probably no compiler optimization tricks happening here. The likely culprit is the argument to bind, bind(&decltype(result)::eval, &result). You are passing a pointer-to-member-function (PTMF) and an object. Unlike the lambda's type, the PTMF's type does not capture which function actually gets called; it only encodes the function signature (parameter and return types). The slow loop is using an indirect-branch function call, because the compiler failed to resolve the function pointer through constant propagation.
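The point about types can be sketched as follows. This is only an illustration: `Result`, `other()`, `callViaBind` and `callViaLambda` are hypothetical names, with `Result` standing in for whatever class `result` has in the question. Whether a given compiler still manages to inline the bound call depends on the optimizer, so this shows the type-level difference, not a guaranteed slowdown.

```cpp
#include <functional>
#include <type_traits>

// Hypothetical stand-in for the class of `result` in the question.
struct Result {
    int eval() const { return 1; }
    int other() const { return 2; }
};

// Two different member functions with the same signature give pointers of
// the same type, so the type alone cannot say which function gets called.
static_assert(std::is_same<decltype(&Result::eval),
                           decltype(&Result::other)>::value,
              "PTMFs with the same signature share one type");

// Call through a bind object: the target is a runtime value stored inside it.
int callViaBind(const Result& r, int (Result::*fn)() const) {
    auto bound = std::bind(fn, &r);
    return bound();
}

// Call through a lambda: the call to eval() is part of the closure type's
// own operator(), so the target is known statically.
int callViaLambda(const Result& r) {
    auto closure = [&r] { return r.eval(); };
    return closure();
}
```

Both helpers return the same values for `eval()`; the difference is only in how much the compiler knows about the call target from the types involved.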
If you rename the member eval() to operator()() and get rid of bind, then the explicit object will essentially behave like the lambda and the performance difference should disappear.
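A minimal sketch of that suggestion, under some assumptions: `Result` and its `calls` counter are hypothetical, `demo()` is only a driver for illustration, and `std::ref` is my addition so that the helper (which takes its callable by value) works on the original object instead of a copy. The helper itself is the one from the question.

```cpp
#include <chrono>
#include <functional>
#include <iostream>
#include <string>

// Hypothetical stand-in for the class of `result`,
// with eval() renamed to operator()().
struct Result {
    int calls = 0;
    void operator()() { ++calls; }
};

// The timing helper from the question.
template<int N = 1, class Fun, class... Args>
void timeExec(std::string name, Fun fun, Args... args) {
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < N; ++i) {
        fun(args...);
    }
    auto end = std::chrono::steady_clock::now();
    auto diff = end - start;
    std::cout << name << ": "
              << std::chrono::duration<double, std::milli>(diff).count()
              << " ms." << std::endl;
}

// Times the object directly, without bind, and returns how many
// times operator()() actually ran on the original object.
int demo() {
    const int TIMES = 10000;
    Result result;
    // std::ref (an assumption, not in the original answer) avoids copying
    // `result`; the call target is still known from the type Result.
    timeExec<TIMES>("Functor evaluation", std::ref(result));
    return result.calls;
}
```

Here the callable's type names `Result::operator()` directly, just as a closure type names its own call operator, so no pointer-to-member has to be resolved at run time.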