Does initialization entail lvalue-to-rvalue conversion? Is `int x = x;` UB?

The C++ standard contains a semi-famous example of "surprising" name lookup in 3.3.2, "Point of declaration":

int x = x;

This initializes x with itself. Being of a primitive type, x is uninitialized at that point and thus has an indeterminate value (assuming it is an automatic variable).
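The point-of-declaration rule itself can be observed without reading any indeterminate value. The following is a minimal sketch, not taken from the original question; the function name `initializer_sees_new_name` is hypothetical. It takes the address of the variable inside its own initializer, which only uses the object's identity.

```cpp
#include <cassert>

// Hypothetical helper: returns true when the name used inside an
// initializer refers to the variable being declared, not to an outer one.
bool initializer_sees_new_name() {
    void* p = nullptr;                  // outer p
    bool refers_to_inner = false;
    {
        // The inner p is already in scope inside its own initializer.
        // Unary & only needs the identity of p, so its value is never read.
        void* p = &p;
        refers_to_inner = (p == static_cast<void*>(&p));
    }
    assert(p == nullptr);               // the outer p was never touched
    return refers_to_inner;
}
```

If lookup had found the outer `p` instead, the inner pointer would hold the outer object's address and the comparison would fail.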

Is this actually undefined behaviour?

According to 4.1 "Lvalue-to-rvalue conversion", it is undefined behaviour to perform lvalue-to-rvalue conversion on an uninitialized value. Does the right-hand x undergo this conversion? If so, would the example actually have undefined behaviour?

Recommended answer

UPDATE: Following the discussion in the comments, I have added more evidence towards the end of this answer.

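DISCLAIMER: I admit that this answer is speculative. On the other hand, the current wording of the C++11 Standard does not seem to allow for a more formal answer.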

In the context of this Q&A, it has emerged that the C++11 Standard fails to formally specify which value categories are expected by each language construct. In the following I will mostly focus on built-in operators, although the question is about initializers. Eventually, I will extend the conclusions drawn for operators to the case of initializers.

In the case of built-in operators, in spite of the lack of a formal specification, there is (non-normative) evidence in the Standard that the intended specification is for prvalues to be expected wherever a value is needed, unless specified otherwise.

For instance, a note in Paragraph 3.10/1 says:

The discussion of each built-in operator in Clause 5 indicates the category of the value it yields and the value categories of the operands it expects. For example, the built-in assignment operators expect that the left operand is an lvalue and that the right operand is a prvalue and yield an lvalue as the result. User-defined operators are functions, and the categories of values they expect and yield are determined by their parameter and return types.

Section 5.17 on assignment operators, on the other hand, does not mention this. However, the possibility of performing an lvalue-to-rvalue conversion is mentioned, again in a note (Paragraph 5.17/1):

Therefore, a function call shall not intervene between the lvalue-to-rvalue conversion and the side effect associated with any single compound assignment operator.

Of course, if no rvalue were expected, this note would be meaningless.
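The value categories that built-in operators yield can be checked with `decltype`. The sketch below is an illustration added here, not part of the original answer: `Category`, `category_of`, `VALUE_CATEGORY` and `categories_match_the_note` are hypothetical names. It shows only what an expression yields; the category an operand is expected to have cannot be observed this way, which is the gap the answer is discussing.

```cpp
#include <cassert>
#include <utility>

enum class Category { prvalue, lvalue, xvalue };

// decltype((e)) is T& for an lvalue, T&& for an xvalue and T for a prvalue.
template <class T> struct category_of      { static constexpr Category value = Category::prvalue; };
template <class T> struct category_of<T&>  { static constexpr Category value = Category::lvalue; };
template <class T> struct category_of<T&&> { static constexpr Category value = Category::xvalue; };

#define VALUE_CATEGORY(expr) (category_of<decltype((expr))>::value)

// Hypothetical check of the categories named in the 3.10/1 note.
bool categories_match_the_note() {
    int a = 0, b = 0;
    assert(VALUE_CATEGORY(a) == Category::lvalue);        // a named variable
    assert(VALUE_CATEGORY(a = b) == Category::lvalue);    // built-in assignment yields an lvalue
    return VALUE_CATEGORY(a + b) == Category::prvalue     // built-in + yields a prvalue
        && VALUE_CATEGORY(std::move(a)) == Category::xvalue;
}
```

The operands of `decltype` are unevaluated, so neither the assignment nor the move is actually performed.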

Another piece of evidence is found in 4/8, as pointed out by Johannes Schaub in the comments to the linked Q&A:

There are some contexts where certain conversions are suppressed. For example, the lvalue-to-rvalue conversion is not done on the operand of the unary & operator. Specific exceptions are given in the descriptions of those operators and contexts.

This seems to imply that lvalue-to-rvalue conversion is performed on all operands of built-in operators, except when specified otherwise. This would mean, in turn, that rvalues are expected as operands of built-in operators unless specified otherwise.
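The suppression for unary & has a practical consequence: an uninitialized object may have its address taken, because only its identity is used. The following is a small sketch under that reading, with a hypothetical function name; it never reads the object before writing it.

```cpp
#include <cassert>

// Hypothetical illustration: taking the address of an uninitialized
// automatic object does not read its value.
int write_through_pointer() {
    int n;          // automatic and deliberately left uninitialized
    int* p = &n;    // unary & uses only the identity of n; its value is not read
    *p = 42;        // n gets a value before any read happens
    assert(p == &n);
    return n;       // reading n is fine now
}
```

Replacing `int* p = &n;` with `int m = n;` would move the example into the territory this question is about.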

CONJECTURE:

Even though initialization is not assignment, and therefore operators do not enter the discussion, my suspicion is that this area of the specification is affected by the very same problem described above.

Traces supporting this belief can be found even in Paragraph 8.5.2/5, about the initialization of references (for which the value of the lvalue initializer expression is not needed):

The usual lvalue-to-rvalue (4.1), array-to-pointer (4.2), and function-to-pointer (4.3) standard conversions are not needed, and therefore are suppressed, when such direct bindings to lvalues are done.

The word "usual" seems to imply that when initializing objects which are not of a reference type, lvalue-to-rvalue conversion is meant to apply.
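The contrast between direct reference binding and initialization of a non-reference object can be sketched as follows. This is an illustration added here with hypothetical function names; whether the second form formally requires an lvalue-to-rvalue conversion is exactly what the answer argues the Standard leaves implicit, so the comments describe only observable behaviour.

```cpp
#include <cassert>

// Binding a reference needs only the identity of the object.
int bind_then_assign() {
    int n;          // deliberately uninitialized
    int& r = n;     // direct binding: the value of n is not needed
    r = 7;          // first write goes through the reference
    assert(&r == &n);
    return n;
}

// Initializing a non-reference object takes a snapshot of the value.
bool copy_reads_the_value() {
    int src = 1;
    int copy = src; // the value of src is read here
    src = 2;        // a later change does not affect the copy
    return copy == 1 && src == 2;
}
```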

Therefore, I believe that although the requirements on the expected value category of initializers are ill-specified (if not completely missing), the evidence provided makes it reasonable to assume that the intended specification is:

Wherever a value is expected by a language construct, a prvalue is expected, unless specified otherwise.

Under this assumption, an lvalue-to-rvalue conversion would be required in your example, and that would lead to Undefined Behavior.

ADDITIONAL EVIDENCE:

To provide further evidence in support of this conjecture, let's assume it is wrong, so that no lvalue-to-rvalue conversion is required for copy-initialization, and consider the following code (thanks to jogojapan for contributing):

int y;
int x = y; // No UB
short t;
int u = t; // UB! (Do not like this non-uniformity, but could accept it)
int z;
z = x; // No UB (x is not uninitialized)
z = y; // UB! (Assuming assignment operators expect a prvalue, see above)
       // This would be very counterintuitive, since x == y

This non-uniform behavior does not make a lot of sense to me. What makes more sense IMO is that wherever a value is required, a prvalue is expected.

Moreover, as Jesse Good correctly points out in his answer, the key paragraph of the C++ Standard is 8.5/16:

― Otherwise, the initial value of the object being initialized is the (possibly converted) value of the initializer expression. Standard conversions (Clause 4) will be used, if necessary, to convert the initializer expression to the cv-unqualified version of the destination type; no user-defined conversions are considered. If the conversion cannot be done, the initialization is ill-formed. [ Note: An expression of type "cv1 T" can initialize an object of type "cv2 T" independently of the cv-qualifiers cv1 and cv2.

However, while Jesse mainly focuses on the "if necessary" bit, I would also like to stress the word "type". The paragraph above mentions that standard conversions will be used "if necessary" to convert to the destination type, but does not say anything about category conversions:

  1. Will category conversions be performed if needed?
  2. Are they needed?

As for the second question, as discussed in the original part of the answer, the C++11 Standard currently does not specify whether category conversions are needed, because it is nowhere stated whether copy-initialization expects a prvalue as an initializer. Thus, a clear-cut answer is impossible to give. However, I believe I have provided enough evidence to assume this to be the intended specification, so that the answer would be "Yes".

As for the first question, it seems reasonable to me that the answer is "Yes" as well. If it were "No", obviously correct programs would be ill-formed:

int y = 0;
int x = y; // y is lvalue, prvalue expected (assuming the conjecture is correct)

To sum it up (A1 = "Answer to question 1", A2 = "Answer to question 2"):

          | A2 = Yes   | A2 = No |
 ---------|------------|---------|
 A1 = Yes |     UB     |  No UB  | 
 A1 = No  | ill-formed |  No UB  |
 ---------------------------------

If A2 is "No", A1 does not matter: there is no UB, but the bizarre situations of the first example show up (e.g. z = y giving UB, but not z = x, even though x == y). If A2 is "Yes", on the other hand, A1 becomes crucial; yet, enough evidence has been given to prove it would be "Yes".

Therefore, my thesis is that A1 = "Yes" and A2 = "Yes", and we should have Undefined Behavior.

FURTHER EVIDENCE:

This defect report (courtesy of Jesse Good) proposes a change that is aimed at giving Undefined Behavior in this case:

[...] In addition, 4.1 [conv.lval] paragraph 1 says that applying the lvalue-to-rvalue conversion to an "object [that] is uninitialized" results in undefined behavior; this should be rephrased in terms of an object with an indeterminate value.

In particular, the proposed wording for Paragraph 4.1 says:

When an lvalue-to-rvalue conversion occurs in an unevaluated operand or a subexpression thereof (Clause 5 [expr]) the value contained in the referenced object is not accessed. In all other cases, the result of the conversion is determined according to the following rules:

― If T is (possibly cv-qualified) std::nullptr_t, the result is a null pointer constant (4.10 [conv.ptr]).

― Otherwise, if the glvalue T has a class type, the conversion copy-initializes a temporary of type T from the glvalue and the result of the conversion is a prvalue for the temporary.

― Otherwise, if the object to which the glvalue refers contains an invalid pointer value (3.7.4.2 [basic.stc.dynamic.deallocation], 3.7.4.3 [basic.stc.dynamic.safety]), the behavior is implementation-defined.

― Otherwise, if T is a (possibly cv-qualified) unsigned character type (3.9.1 [basic.fundamental]), and the object to which the glvalue refers contains an indeterminate value (5.3.4 [expr.new], 8.5 [dcl.init], 12.6.2 [class.base.init]), and that object does not have automatic storage duration or the glvalue was the operand of a unary & operator or it was bound to a reference, the result is an unspecified value. [Footnote: The value may be different each time the lvalue-to-rvalue conversion is applied to the object. An unsigned char object with indeterminate value allocated to a register might trap. ―end footnote]

― Otherwise, if the object to which the glvalue refers contains an indeterminate value, the behavior is undefined.

― Otherwise, if the glvalue has (possibly cv-qualified) type std::nullptr_t, the prvalue result is a null pointer constant (4.10 [conv.ptr]). Otherwise, the value contained in the object indicated by the glvalue is the prvalue result.
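The first sentence of the proposed wording, about unevaluated operands, can be illustrated with a self-referencing declaration that never accesses a stored value. This is a sketch added here, not part of the defect report, and the function name is hypothetical. It relies only on the long-standing rule that the operands of `sizeof`, `decltype` and `noexcept` are not evaluated.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical illustration: uses of an uninitialized object inside
// unevaluated operands never access its value.
std::size_t unevaluated_self_reference() {
    int n;                                   // deliberately uninitialized
    static_assert(std::is_same<decltype(n), int>::value, "decltype inspects only the type");
    static_assert(noexcept(n + 1), "noexcept does not evaluate n + 1");
    static_assert(sizeof(n + 1) == sizeof(int), "sizeof does not evaluate n + 1");

    std::size_t x = sizeof x;                // x names itself, but only its type is used
    assert(x == sizeof(std::size_t));
    return x;
}
```

Unlike `int x = x;`, the declaration `std::size_t x = sizeof x;` is well-defined: the initializer refers to the new `x` by the same point-of-declaration rule, but nothing reads its value.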
