Why is it allowed to cast a pointer to a reference?

2022-01-05 00:00:00 reference pointers casting c++

This was originally the topic of this question, until it emerged that the OP had just overlooked the dereference. Meanwhile, this answer got me and some others thinking: why is it allowed to cast a pointer to a reference with a C-style cast or reinterpret_cast?

    int main()
    {
        char c = 'A';
        char* pc = &c;

        char& c1 = (char&)pc;
        char& c2 = reinterpret_cast<char&>(pc);
    }

The above code compiles without any warning or error (regarding the cast) on Visual Studio while GCC will only give you a warning, as shown here.

My first thought was that the pointer somehow automagically gets dereferenced (I work with MSVC normally, so I didn't get the warning GCC shows), and tried the following:

    #include <iostream>

    int main()
    {
        char c = 'A';
        char* pc = &c;
        char& c1 = (char&)pc;

        std::cout << *pc << " ";
        c1 = 'B';
        std::cout << *pc << " ";
    }

With the very interesting output shown here. So it seems that you are accessing the pointed-to variable, but at the same time, you are not.

Ideas? Explanations? Standard quotes?

Recommended answer

Well, that's the purpose of reinterpret_cast! As the name suggests, the purpose of that cast is to reinterpret a memory region as a value of another type. For this reason, using reinterpret_cast you can always cast an lvalue of one type to a reference of another type.

This is described in 5.2.10/10 of the language specification. It also says there that reinterpret_cast<T&>(x) is the same thing as *reinterpret_cast<T*>(&x).

The fact that you are casting a pointer in this case is completely unimportant. No, the pointer does not get automatically dereferenced (taking into account the *reinterpret_cast<T*>(&x) interpretation, one might even say that the opposite is true: the address of that pointer is automatically taken). The pointer in this case serves as just "some variable that occupies some region in memory". The type of that variable makes no difference whatsoever. It can be a double, a pointer, an int or any other lvalue. The variable is simply treated as a memory region that you reinterpret as another type.

As for the C-style cast - it just gets interpreted as reinterpret_cast in this context, so the above immediately applies to it.

In your second example you attached the reference c1 to the memory occupied by the pointer variable pc. When you did c1 = 'B', you forcefully wrote the value 'B' into that memory, thus destroying the original pointer value (by overwriting one byte of it). The destroyed pointer now points to some unpredictable location. Later you tried to dereference that destroyed pointer. The behavior is undefined, so what happens in that case is a matter of pure luck. The program might crash, since the pointer is generally not dereferenceable. Or you might get lucky and have your pointer point to some unpredictable yet valid location. In that case your program will output something. No one knows what it will output, and there is no meaning in it whatsoever.

One can rewrite your second program into an equivalent program without references:

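    #include <iostream>

    int main()
    {
        char* pc = new char('A');
        char* c = (char*)&pc;      // c points at the first byte of pc itself

        std::cout << *pc << " ";
        *c = 'B';                  // overwrites one byte of the pointer value
        std::cout << *pc << " ";   // dereferences the corrupted pointer
    }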

From the practical point of view, on a little-endian platform your code would overwrite the least-significant byte of the pointer. Such a modification will not make the pointer point too far away from its original location, so the code is more likely to print something instead of crashing. On a big-endian platform your code would destroy the most-significant byte of the pointer, throwing it wildly to a totally different location and making your program more likely to crash.
