Taking the address of a one-past-the-end array element via subscript: legal by the C++ Standard or not?

2022-01-31 00:00:00 c standards language-lawyer c++

I have seen it asserted several times now that the following code is not allowed by the C++ Standard:

    int array[5];
    int *array_begin = &array[0];
    int *array_end = &array[5];

Is &array[5] legal C++ code in this context?

I would like an answer with a reference to the Standard if possible.

It would also be interesting to know if it meets the C standard. And if it isn't standard C++, why was the decision made to treat it differently from array + 5 or &array[4] + 1?

Recommended answer

Your example is legal, but only because you're not actually using an out-of-bounds pointer.

Let's deal with out of bounds pointers first (because that's how I originally interpreted your question, before I noticed that the example uses a one-past-the-end pointer instead):

In general, you're not even allowed to create an out-of-bounds pointer. A pointer must point to an element within the array, or one past the end. Nowhere else.

The pointer is not even allowed to exist, which means you're obviously not allowed to dereference it either.
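As a minimal sketch of that rule (the helper names `sum_range` and `sum_five` are made up for illustration and do not come from the question), the one-past-the-end pointer can be formed and compared against, but it is never dereferenced:

```cpp
// Hypothetical helper: sums the ints in the half-open range [first, last).
// `last` may be a one-past-the-end pointer; it is only compared, never read.
int sum_range(const int* first, const int* last) {
    int total = 0;
    for (const int* p = first; p != last; ++p) {
        total += *p;  // p always points at a real element here
    }
    return total;
}

// Hypothetical driver showing which pointers may be formed.
int sum_five() {
    int array[5] = {1, 2, 3, 4, 5};
    const int* array_end = array + 5;  // valid: one past the last element
    // const int* bad = array + 6;     // undefined behavior just to form it
    return sum_range(array, array_end);
}
```

Calling `sum_five()` walks all five elements and stops when the loop pointer equals `array_end`.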

Here's what the standard has to say on the subject:

5.7:5:

When an expression that has integral type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integral expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N has the value n) point to, respectively, the i+n-th and i-n-th elements of the array object, provided they exist. Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.

(emphasis mine)

Of course, this is for operator+. So just to be sure, here's what the standard says about array subscripting:

5.2.1:1:

The expression E1[E2] is identical (by definition) to *((E1)+(E2))
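A small sketch of that equivalence for an in-bounds index (the function name `subscript_matches_pointer_arithmetic` is made up for illustration):

```cpp
// Hypothetical check: subscripting and pointer arithmetic name the same element.
bool subscript_matches_pointer_arithmetic() {
    int array[5] = {10, 20, 30, 40, 50};

    int via_subscript  = array[2];     // E1[E2]
    int via_arithmetic = *(array + 2); // *((E1)+(E2))
    int via_swapped    = 2[array];     // addition commutes, so this also works

    return via_subscript == 30
        && via_arithmetic == 30
        && via_swapped == 30
        && &array[2] == array + 2;     // same address either way
}
```

Because the subscript is defined in terms of `+` and unary `*`, the pointer arithmetic rules quoted from 5.7 apply to `array[i]` as well.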

Of course, there's an obvious caveat: your example doesn't actually show an out-of-bounds pointer. It uses a "one past the end" pointer, which is different. The pointer is allowed to exist (as the above says), but the standard, as far as I can see, says nothing about dereferencing it. The closest I can find is 3.9.2:3:

[Note: for instance, the address one past the end of an array (5.7) would be considered to point to an unrelated object of the array’s element type that might be located at that address. -end note]

Which seems to me to imply that yes, you can legally dereference it, but the result of reading or writing to the location is unspecified.

Thanks to ilproxyil for correcting the last bit here, answering the last part of your question:

array + 5 doesn't actually dereference anything, it simply creates a pointer to one past the end of array.

&array[4] + 1 dereferences array+4 (which is perfectly safe), takes the address of that lvalue, and adds one to that address, which results in a one-past-the-end pointer (but that pointer never gets dereferenced).

&array[5] dereferences array+5 (which as far as I can see is legal, and results in "an unrelated object of the array’s element type", as the above said), and then takes the address of that element, which also seems legal enough.

So they don't do quite the same thing, although in this case, the end result is the same.
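As a rough illustration of the three spellings side by side (the function name `end_pointers_agree` is made up, and `std::end` is added only as a cross-check; it does not appear in the question), the sketch below compares the resulting pointers. It assumes the reading argued above, namely that `&array[5]` is acceptable because nothing is ever read or written through it. If you would rather not rely on that reading, `array + 5` gives the same pointer without any subscript.

```cpp
#include <iterator>  // std::end

// Hypothetical check: every spelling yields the same one-past-the-end pointer.
bool end_pointers_agree() {
    int array[5] = {};

    int* by_addition  = array + 5;       // pure pointer arithmetic
    int* by_last_plus = &array[4] + 1;   // address of the last element, plus one
    int* by_subscript = &array[5];       // the form the question asks about
    int* by_std_end   = std::end(array); // library spelling, for comparison

    // None of the pointers is dereferenced; they are only compared.
    return by_addition == by_last_plus
        && by_last_plus == by_subscript
        && by_subscript == by_std_end
        && (by_addition - array) == 5;
}
```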
