C++ 性能挑战:整数到 std::string 的转换
谁能将我的整数的性能击败到 std::string 代码,链接如下?
已经有几个问题解释了如何在 C++ 中将整数转换为 std::string
,例如 这个,但提供的解决方案都不是有效的.
There are already several questions that explain how to convert an integer into a std::string
in C++, such as this one, but none of the solutions provided are efficient.
这里是一些常见的竞争方法的编译就绪代码:
Here is compile-ready code for some common methods to compete against:
- C++ 方式",使用字符串流:http://ideone.com/jh3Sa
- sprintf,SO 人通常向注重性能的人推荐它:http://ideone.com/82kwR
与流行的看法相反, boost::lexical_cast
有自己的实现(whitepaper) 并且不使用 stringstream
和数字插入运算符.我真的很想比较它的性能,因为这个其他问题表明它很悲惨.
Contrary to popular belief, boost::lexical_cast
has its own implementation (white paper) and does not use stringstream
and numeric insertion operators. I'd really like to see its performance compared, because this other question suggests that it's miserable.
还有我自己的贡献,它在台式计算机上具有竞争力,并演示了一种在嵌入式系统上也能全速运行的方法,这与依赖于整数模的算法不同:
And my own contribution, which is competitive on desktop computers, and demonstrates an approach that runs at full speed on embedded systems as well, unlike algorithms dependent on integer modulo:
- Ben 的算法:http://ideone.com/SsEUW
如果您想使用该代码,我将在简化的 BSD 许可下提供它(允许商业使用,需要署名).问问吧.
If you want to use that code, I'll make it available under a simplified BSD license (commercial use allowed, attribution required). Just ask.
最后,函数 ltoa
是非标准的,但可以广泛使用.
Finally, the function ltoa
is non-standard but widely available.
- ltoa 版本,适用于任何拥有提供它的编译器(ideone 没有)的人:http://ideone.com/T5Wim
我很快就会发布我的性能测量结果作为答案.
I'll post my performance measurements as an answer shortly.
- 提供将至少 32 位有符号和无符号整数转换为十进制的代码.
- 将输出生成为
std::string
. - 没有与线程和信号不兼容的技巧(例如,静态缓冲区).
- 您可以假设一个 ASCII 字符集.
- 确保在绝对值无法表示的二进制补码机上测试
INT_MIN
上的代码. - 理想情况下,输出应与使用
stringstream
、http://ideone.com/jh3Sa,但任何可以清楚理解为正确数字的内容也可以. - 新:虽然您可以使用任何您想要的编译器和优化器选项(完全禁用除外)进行比较,但代码也需要至少在 VC++ 2010 和 g++ 下编译并给出正确的结果.
- Provide code for a conversion of at least 32-bit signed and unsigned integers into decimal.
- Produce output as a
std::string
. - No tricks that are incompatible with threading and signals (for example, static buffers).
- You may assume an ASCII character set.
- Make sure to test your code on
INT_MIN
on a two's complement machine where the absolute value is not representable. - Ideally, the output should be character-for-character identical with the canonical C++ version using
stringstream
, http://ideone.com/jh3Sa, but anything that is clearly understandable as the correct number is ok, too. - NEW: Although you can use whatever compiler and optimizer options (except completely disabled) you want for the comparison, the code needs to also compile and give correct results under at least VC++ 2010 and g++.
除了更好的算法之外,我还想在几个不同的平台和编译器上获得一些基准测试(让我们使用 MB/s 吞吐量作为我们的标准度量单位).我相信我的算法的代码(我知道 sprintf
基准采用一些捷径 - 现在已修复)是??标准定义明确的行为,至少在 ASCII 假设下,但如果你看到任何输出无效的未定义行为或输入,请指出.
Besides better algorithms, I'd also like to get some benchmarks on several different platforms and compilers (let's use MB/s throughput as our standard unit of measure). I believe that the code for my algorithm (I know the sprintf
benchmark takes some shortcuts -- now fixed) is well-defined behavior by the standard, at least under the ASCII assumption, but if you see any undefined behavior or inputs for which the output is invalid, please point that out.
针对 g++ 和 VC2010 执行不同的算法,可能是由于 std::string
在每个上的实现不同.VC2010 显然在 NRVO 上做得更好,摆脱仅在 gcc 上有帮助的按值返回.
Different algorithms perform for g++ and VC2010, likely due to the different implementations of std::string
on each. VC2010 clearly does a better job with NRVO, getting rid of return-by-value helped only on gcc.
发现代码的性能比 sprintf
高一个数量级.ostringstream
落后了 50 倍甚至更多.
Code was found that outperforms sprintf
by an order of magnitude. ostringstream
falls behind by a factor of 50 and more.
挑战的获胜者是 user434507,他生成的代码在 gcc 上的运行速度是我自己的 350%.由于 SO 社区的异想天开,其他条目已关闭.
The winner of the challenge is user434507 who produces code that runs 350% of the speed of my own on gcc. Further entries are closed due to the whims of the SO community.
目前(最终?)速度冠军是:
The current (final?) speed champions are:
- 对于 gcc:user434507,比
sprintf
快 8 倍:http://ideone.com/0uhhX - 对于 Visual C++:Timo,比
sprintf
快 15 倍:http://ideone.com/VpKO3
- For gcc: user434507, at 8 times faster than
sprintf
: http://ideone.com/0uhhX - For Visual C++: Timo, at 15 times faster than
sprintf
: http://ideone.com/VpKO3
推荐答案
#include <string>
const char digit_pairs[201] = {
"00010203040506070809"
"10111213141516171819"
"20212223242526272829"
"30313233343536373839"
"40414243444546474849"
"50515253545556575859"
"60616263646566676869"
"70717273747576777879"
"80818283848586878889"
"90919293949596979899"
};
std::string& itostr(int n, std::string& s)
{
if(n==0)
{
s="0";
return s;
}
int sign = -(n<0);
unsigned int val = (n^sign)-sign;
int size;
if(val>=10000)
{
if(val>=10000000)
{
if(val>=1000000000)
size=10;
else if(val>=100000000)
size=9;
else
size=8;
}
else
{
if(val>=1000000)
size=7;
else if(val>=100000)
size=6;
else
size=5;
}
}
else
{
if(val>=100)
{
if(val>=1000)
size=4;
else
size=3;
}
else
{
if(val>=10)
size=2;
else
size=1;
}
}
size -= sign;
s.resize(size);
char* c = &s[0];
if(sign)
*c='-';
c += size-1;
while(val>=100)
{
int pos = val % 100;
val /= 100;
*(short*)(c-1)=*(short*)(digit_pairs+2*pos);
c-=2;
}
while(val>0)
{
*c--='0' + (val % 10);
val /= 10;
}
return s;
}
std::string& itostr(unsigned val, std::string& s)
{
if(val==0)
{
s="0";
return s;
}
int size;
if(val>=10000)
{
if(val>=10000000)
{
if(val>=1000000000)
size=10;
else if(val>=100000000)
size=9;
else
size=8;
}
else
{
if(val>=1000000)
size=7;
else if(val>=100000)
size=6;
else
size=5;
}
}
else
{
if(val>=100)
{
if(val>=1000)
size=4;
else
size=3;
}
else
{
if(val>=10)
size=2;
else
size=1;
}
}
s.resize(size);
char* c = &s[size-1];
while(val>=100)
{
int pos = val % 100;
val /= 100;
*(short*)(c-1)=*(short*)(digit_pairs+2*pos);
c-=2;
}
while(val>0)
{
*c--='0' + (val % 10);
val /= 10;
}
return s;
}
这会在不允许未对齐内存访问的系统上爆炸(在这种情况下,通过 *(short*)
进行的第一个未对齐分配将导致段错误),但否则应该会很好地工作.
This will blow up on systems that disallow unaligned memory accesses (in which case, the first unaligned assignment via *(short*)
would cause a segfault), but should work very nicely otherwise.
要做的一件重要事情是尽量减少std::string
的使用.(讽刺的是,我知道.)例如,在 Visual Studio 中,对 std::string 方法的大多数调用都没有内联,即使您在编译器选项中指定/Ob2 也是如此.因此,即使像调用 std::string::clear()
这样微不足道的事情,您可能期望它非常快,但在将 CRT 作为静态库链接时也可能需要 100 个时钟信号,并且作为 DLL 链接时为 300 个时钟信号.
One important thing to do is to minimize the use of std::string
. (Ironic, I know.) In Visual Studio, for example, most calls to methods of std::string are not inlined, even if you specify /Ob2 in compiler options. So even something as trivial as a call to std::string::clear()
, which you might expect to be very fast, can take 100 clockticks when linking CRT as a static library, and as much as 300 clockticks when linking as a DLL.
出于同样的原因,通过引用返回更好,因为它避免了赋值、构造函数和析构函数.
For the same reason, returning by reference is better because it avoids an assignment, a constructor and a destructor.
相关文章