Inconsistency between std::string and string literals
I have discovered a disturbing inconsistency between std::string and string literals in C++0x:
#include <iostream>
#include <string>

int main()
{
    int i = 0;
    for (auto e : "hello")
        ++i;
    std::cout << "Number of elements: " << i << '\n';

    i = 0;
    for (auto e : std::string("hello"))
        ++i;
    std::cout << "Number of elements: " << i << '\n';

    return 0;
}
The output is:
Number of elements: 6
Number of elements: 5
I understand the mechanics of why this is happening: the string literal is really an array of characters that includes the null character, and when the range-based for loop calls std::end() on the character array, it gets a pointer past the end of the array. Since the null character is part of the array, that pointer is past the null character.
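The mechanics described above can be checked directly. This is a minimal sketch, not part of the original code: array_extent and check_lengths are hypothetical helper names introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>

// Number of elements a range-based for loop visits for a built-in array:
// std::end(a) - std::begin(a), which equals the array bound N.
template <typename T, std::size_t N>
std::size_t array_extent(const T (&a)[N])
{
    return static_cast<std::size_t>(std::end(a) - std::begin(a));
}

// "hello" has type const char[6]: five letters plus the terminating '\0'.
static_assert(sizeof("hello") == 6, "the literal includes the null terminator");

void check_lengths()
{
    assert(array_extent("hello") == 6);        // the terminator is counted
    assert(std::string("hello").size() == 5);  // the terminator is not counted
    assert(*(std::end("hello") - 1) == '\0');  // last array element is the null character
}
```

The extra element is the null character itself, so a loop over the literal sees 'h', 'e', 'l', 'l', 'o', '\0'.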
However, I think this is very undesirable: surely std::string and string literals should behave the same when it comes to properties as basic as their length?
Is there a way to resolve this inconsistency? For example, can std::begin() and std::end() be overloaded for character arrays so that the range they delimit does not include the terminating null character? If so, why was this not done?
EDIT: To justify my indignation a bit more to those who have said that I'm just suffering the consequences of using C-style strings, which are a "legacy feature", consider code like the following:
template <typename Range>
void f(Range&& r)
{
    for (auto e : r)
    {
        ...
    }
}
Would you expect f("hello") and f(std::string("hello")) to do something different?
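To make the difference concrete, here is a minimal sketch in which the elided loop body simply counts elements. The names count_elements and check_count_elements are hypothetical stand-ins for f, chosen for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for f: counts the elements the loop visits.
template <typename Range>
std::size_t count_elements(Range&& r)
{
    std::size_t n = 0;
    for (auto e : r)
    {
        (void)e;  // silence unused-variable warnings
        ++n;
    }
    return n;
}

void check_count_elements()
{
    // For the literal, Range is deduced as const char (&)[6],
    // so the loop also visits the terminating null character.
    assert(count_elements("hello") == 6);
    assert(count_elements(std::string("hello")) == 5);
}
```

The same generic function therefore reports two different lengths for what looks like the same text.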
Recommended answer
The inconsistency can be resolved using another tool in C++0x's toolbox: user-defined literals. Using an appropriately-defined user-defined literal:
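#include <cstddef>
#include <string>

// Suffixes without a leading underscore are reserved for the standard library,
// so this user-defined literal is named _s.
std::string operator""_s(const char* p, std::size_t n)
{
    return std::string(p, n);
}

we will be able to write:

int i = 0;
for (auto e : "hello"_s)
    ++i;
std::cout << "Number of elements: " << i << '\n';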
This now outputs the expected number:
Number of elements: 5
With these new std::string literals, there is arguably no more reason to use C-style string literals, ever.
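For readers on C++14 or later, a hand-written operator is not required: the standard library provides a std::string literal with the suffix s in namespace std::literals::string_literals (declared in <string>). A minimal sketch, assuming a C++14 compiler; count_standard_literal is a hypothetical name used only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

std::size_t count_standard_literal()
{
    using namespace std::string_literals;  // brings in the standard operator""s (C++14)

    std::size_t n = 0;
    for (auto e : "hello"s)  // "hello"s is a std::string, not a char array
    {
        (void)e;  // silence unused-variable warnings
        ++n;
    }
    return n;
}
```

Called as count_standard_literal(), this returns 5, matching std::string("hello").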