Does g++ meet the std::string C++11 requirements?
Consider the following example:
#include <iostream>
#include <string>
using namespace std;

int main()
{
    string x = "hello";
    // The copy constructor is called here.
    string y(x);
    // c_str() returns const char*, but this usage is quite popular.
    char* temp = (char*)y.c_str();
    temp[0] = 'p';
    cout << "x = " << x << endl;
    cout << "y = " << y << endl;
    cin >> x;
    return 0;
}
Run it with the Visual Studio compiler and with g++.
When I did so, I got two different results.
In g++:
x = pello
y = pello
In Visual Studio 2010:
x = hello
y = pello
The reason for the difference is most likely that the g++ std::string implementation uses the COW (copy-on-write) technique and the Visual Studio one does not.
Now the C++ standard (page 616, Table 64) states, with regard to the string copy constructor
basic_string(const basic_string& str):
Effects: data() "points at the first element of an allocated copy of the array whose first element is pointed at by str.data()"
Meaning COW is not allowed (at least to my understanding).
How can that be?
Does g++ meet the std::string C++11 requirements?
Before C++11 this did not pose a big problem, since c_str() was not required to return a pointer to the actual data the string object holds, so changing it didn't matter. But after the change, this combination of COW and returning the actual pointer can and does break old applications (applications that deserve it for bad coding, but nevertheless).
Do you agree with me? If so, can something be done? Does anyone have an idea of how to approach this in a very large, old code base (a Klocwork rule to catch this would be nice)?
Note that even without casting the constness away, one might invalidate a pointer by calling c_str(), saving the pointer, and then calling a non-const method (which will cause a write).
Another example, without casting the constness away:
#include <iostream>
#include <string>
using namespace std;

int main()
{
    string x = "hello";
    // The copy constructor is called here.
    string y(x);
    // c_str() returns const char*; the pointer is saved before the write below.
    const char* temp = y.c_str();
    y[0] = 'p';
    // Now we expect "pello", because the standard says the pointer points to the actual data,
    // but we will get "hello".
    cout << "temp = " << temp << endl;
    return 0;
}
Recommended answer
You're right that COW is disallowed. But GCC hasn't updated its implementation yet, allegedly due to ABI constraints. A new implementation, intended eventually to supplant the current std::string implementation, can be found in ext/vstring.h.
The fix for a bug in libstdc++'s std::string, albeit not this one, is not going to make it into GCC 4.9; Jonathan indicates on the bug report that it has only been fixed for vstring so far. My guess, then, is that the COW issue will be resolved around the same time.
Despite all this, casting away constness and then mutating is pretty much always a bad idea. Though you're correct that this should in practice be safe with a fully C++11-compliant string implementation, you're making assumptions, and this very problem proves that you cannot always rely on those assumptions to hold. So, while your code example may be "popular", it's popular in poor code, and shouldn't be written even now. And, of course, writing that in C++03 is flat-out incompetence!