Comparing 3 modern C++ ways to convert integral values to strings

2021-12-24 00:00:00 c++ c++11 stl boost

I was trying to pick a standard way to convert integral values to strings, so I ran a small performance evaluation, measuring the execution time of 3 methods:

#include <iostream>
#include <string>
#include <sstream>
#include <vector>
#include <chrono>
#include <random>
#include <algorithm>   // std::generate
#include <stdexcept>   // std::runtime_error
#include <type_traits>
#include <boost/lexical_cast.hpp>

using namespace std;

// 1. A way to easily measure elapsed time -------------------
template<typename TimeT = std::chrono::milliseconds>
struct measure
{
    template<typename F>
    static typename TimeT::rep execution(F const &func)
    {
        // steady_clock is monotonic, so the interval is unaffected by wall-clock adjustments
        auto start = std::chrono::steady_clock::now();
        func();
        auto duration = std::chrono::duration_cast<TimeT>(
            std::chrono::steady_clock::now() - start);
        return duration.count();
    }
};
// -----------------------------------------------------------

// 2. Define the conversion functions ========================
template<typename T> // A. Using stringstream ================
string StringFromNumber_SS(T const &value) {
    stringstream ss;
    ss << value;
    return ss.str();
}

template<typename T> // B. Using boost::lexical_cast =========
string StringFromNumber_LC(T const &value) {
    return boost::lexical_cast<string>(value);
}

template<typename T> // C. Using c++11 to_string() ===========
string StringFromNumber_C11(T const &value) {
    return std::to_string(value);
}
// ===========================================================

// 3. A wrapper to measure the different executions ----------
template<typename T, typename F>
long long MeasureExec(std::vector<T> const &v1, F const &func)
{
    return measure<>::execution([&]() {
        for (auto const &i : v1) {
            if (func(i) != StringFromNumber_LC(i)) {
                throw std::runtime_error("FAIL");
            }
        }
    });
}
// -----------------------------------------------------------

// 4. Machinery to generate random numbers into a vector -----
template<typename T>
typename std::enable_if<std::is_integral<T>::value>::type 
FillVec(vector<T> &v)
{
    std::mt19937 e2(1);
    std::uniform_int_distribution<> dist(3, 1440);
    std::generate(v.begin(), v.end(), [&]() { return dist(e2); });
}

template<typename T>
typename std::enable_if<!std::is_integral<T>::value>::type 
FillVec(vector<T> &v)
{
    std::mt19937 e2(1);
    std::uniform_real_distribution<> dist(-1440., 1440.);
    std::generate(v.begin(), v.end(), [&]() { return dist(e2); });
}
// -----------------------------------------------------------

int main()
{
    std::vector<int> v1(991908);
    FillVec(v1);

    cout << "C++ 11 method ......... " <<
        MeasureExec(v1, StringFromNumber_C11<int>) << endl;
    cout << "String stream method .. " <<
        MeasureExec(v1, StringFromNumber_SS<int>) << endl;
    cout << "Lexical cast method ... " <<
        MeasureExec(v1, StringFromNumber_LC<int>) << endl;

    return 0;
}

A typical output (running Release in VS2013, which implies the /O2 optimization flag; times are in milliseconds, the default TimeT of measure) would be:

C++ 11 method ......... 273

String stream method .. 1923

Lexical cast method ... 222

UPDATE

Alternatively, an online run on gcc with

g++ -std=c++11 -Ofast -march=native -Wall -pedantic main.cpp && ./a.out

C++ 11 method ......... 414

String stream method .. 1538

Lexical cast method ... 275

Disclaimer: results are meant to be compared with each other, not across machines.

Questions

1. Why is the string stream method consistently the worst (by an order of magnitude)? Should it be viewed as deprecated now that faster alternatives have emerged?

2. Why is lexical cast consistently the best? Can we assume that this is the fastest implementation?

Please feel free to tweak and play with your versions of this code. I'd appreciate your insights on the topic.

PS

The code that was actually run had only one measurement per main(). Here all 3 are presented together to save space.

Optimization flags are compiler specific or application mandated. I'm just providing the code blocks to perform the tests, and I expect SO users to chip in with their results or with suggestions on what the optimum configuration per compiler would be (for what it's worth, I provided the flags used here).

The code works for any numeric-to-string conversion (it takes changing the type of v1 in main). sehe did that for double (mentioned in a comment on his answer). It's a good idea to play with that too.
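One thing to watch when switching v1 to double: the conversion functions no longer produce the same text, so the equality check inside MeasureExec can be expected to throw. The sketch below shows the difference using only the standard library; the helper names via_stream and via_to_string are made up for illustration and stand in for StringFromNumber_SS and StringFromNumber_C11.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: format through operator<<, as StringFromNumber_SS does.
template <typename T>
std::string via_stream(T const &value) {
    std::ostringstream ss;
    ss << value;
    return ss.str();
}

// Hypothetical helper: format through std::to_string, as StringFromNumber_C11 does.
template <typename T>
std::string via_to_string(T const &value) {
    return std::to_string(value);
}
```

For an int such as 42 both helpers return "42". For the double 1.5, std::to_string as specified in C++11 formats like printf("%f") and returns "1.500000", while the stream returns "1.5" with its default precision of 6 significant digits.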

Solution

Question 1. Why is the string stream method consistently the worst?

The classic mistake: creating a new stringstream every single time. Reusing one thread_local stream and resetting it between calls avoids that:

template<typename T> // 1. Using stringstream
string StringFromIntegral_SS(T const &value) {
    thread_local stringstream ss;
    ss.str("");
    ss.clear();
    ss << value;
    return ss.str();
}
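To make the two patterns easy to compare side by side, here is a minimal standalone sketch. The function names to_string_fresh and to_string_reused are invented for this illustration; the second one mirrors the thread_local approach above. The usual explanation for the gap is that constructing a stream sets up locale and formatting state on every call, which the reused stream pays only once per thread.

```cpp
#include <sstream>
#include <string>

// Constructs a new stream on every call (the pattern from the question).
template <typename T>
std::string to_string_fresh(T const &value) {
    std::ostringstream ss;
    ss << value;
    return ss.str();
}

// Reuses one stream per thread; only its contents and state flags are reset.
template <typename T>
std::string to_string_reused(T const &value) {
    thread_local std::ostringstream ss;
    ss.str("");   // drop the characters left by the previous call
    ss.clear();   // reset eofbit/failbit/badbit
    ss << value;
    return ss.str();
}
```

Both functions return the same text for the same input. One caveat of the reused stream: formatting flags set on it (std::hex, precision and so on) persist across calls, so it should only be written to in its default state.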

Question 2. Why is lexical cast consistently the best? Can we assume that this is the fastest implementation?

Because it's most specialized; and, no, faster implementations exist. FastFormat and Boost Spirit have competitive offerings, as far as I know.
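A standard-library option that postdates this answer is std::to_chars from <charconv> (C++17), which writes digits into a caller-supplied buffer without locale handling or allocation. The sketch below is not part of the measurements here and makes no claim about its relative speed; the function name string_from_integral_tc is made up, and it assumes a C++17 compiler.

```cpp
#include <charconv>
#include <limits>
#include <string>
#include <system_error>
#include <type_traits>

// Sketch: integral -> std::string via C++17 std::to_chars.
template <typename T>
std::string string_from_integral_tc(T value) {
    static_assert(std::is_integral<T>::value, "integral types only");
    // digits10 + 1 covers every decimal digit; one more char for a minus sign.
    char buf[std::numeric_limits<T>::digits10 + 2];
    auto res = std::to_chars(buf, buf + sizeof buf, value);
    // res.ec is std::errc() on success; the buffer is sized so failure is not expected.
    return res.ec == std::errc() ? std::string(buf, res.ptr) : std::string();
}
```

The only allocation left is the one made by the returned std::string, which small-string optimization may avoid for short results.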

Update: Boost Spirit Karma still easily beats the bunch:

template<typename T> // 4. Karma to string
std::string StringFromIntegral_K(T const &value) {
    thread_local auto const gen = boost::spirit::traits::create_generator<T>::call();
    thread_local char buf[20]; // 20 chars hold any 64-bit integer, sign included
    char* it = buf;
    boost::spirit::karma::generate(it, gen, value);
    return std::string(buf, it);
}

Timings:

C++ 11 method 111
String stream method 103
Lexical cast method 57
Spirit Karma method 36
Spirit Karma method with string_ref 13

See it Live On Coliru Clang or GCC


BONUS

Just to goof off, a version using boost::string_ref is much faster still, due to the reduced allocations:

template<typename T> // 5. Karma to string_ref
boost::string_ref StringFromIntegral_KSR(T const &value) {
    thread_local auto const gen = boost::spirit::traits::create_generator<T>::call();
    thread_local char buf[20];
    char* it = buf;
    boost::spirit::karma::generate(it, gen, value);
    return boost::string_ref(buf, it-buf);
}

I've tested all modified methods for correctness using an asserting test loop:

return measure<>::execution(
    //[&]() { for (auto const &i : v1) { func(i); }});
    [&]() { for (auto const &i : v1) { assert(func(i) == StringFromIntegral_LC(i)); }});
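Note that the asserting loop, like MeasureExec in the question, times the reference conversion (lexical_cast) together with the function under test, and that assert compiles to nothing when NDEBUG is defined. A possible way to separate the two concerns is to time the conversions alone and verify the results in a separate pass. The sketch below covers the timing half; the name time_conversions and the choice of microseconds are assumptions for illustration.

```cpp
#include <chrono>
#include <cstddef>
#include <string>
#include <vector>

// Sketch: time only the conversions; correctness is checked elsewhere.
// Summing the result lengths gives the loop an observable effect, so the
// optimizer cannot discard the conversions.
template <typename T, typename F>
long long time_conversions(std::vector<T> const &values, F const &func,
                           std::size_t &total_length) {
    auto start = std::chrono::steady_clock::now();
    std::size_t sum = 0;
    for (auto const &v : values) {
        sum += func(v).size();
    }
    auto stop = std::chrono::steady_clock::now();
    total_length = sum;
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

It can be called with any of the conversion functions, for example time_conversions(v1, StringFromNumber_C11<int>, total), and the returned total length doubles as a cheap sanity check that two methods produced output of the same size.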
