Why is unsigned int 0xFFFFFFFF equal to int -1?

2021-12-31 00:00:00 binary c casting c++

In C or C++, it is said that the maximum value a size_t (an unsigned integer type) can hold is the same as the result of casting -1 to that type. For example, see "Invalid Value for size_t".

Why?

I mean (talking about 32-bit ints), AFAIK the most significant bit holds the sign in a signed data type (that is, bit 0x80000000 is set to form a negative number). Then 1 is 0x00000001, and 0x7FFFFFFF is the greatest positive number an int can hold.

Then, AFAIK, the binary representation of the int -1 should be 0x80000001 (perhaps I'm wrong). Why and how is this binary value converted to something completely different (0xFFFFFFFF) when casting an int to unsigned? Or, how is it possible to form a binary -1 out of 0xFFFFFFFF?

I have no doubt that in C, ((unsigned int)-1) == 0xFFFFFFFF or ((int)0xFFFFFFFF) == -1 is just as true as 1 + 1 == 2. I'm just wondering why.

Recommended answer

C and C++ can run on many different architectures and machine types. Consequently, they can have different representations of numbers, with two's complement and ones' complement being the most common. In general, you should not rely on a particular representation in your program.
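To make the difference between representations concrete, here is a minimal sketch of how a negative number would be encoded in 32 bits under three schemes. The function names are hypothetical and exist only for illustration. The patterns are computed with unsigned arithmetic, so the result does not depend on how the host machine stores signed integers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers: each returns the 32-bit pattern that would encode
   the negative number -magnitude under one representation scheme. */

/* Sign-magnitude: set the top bit, keep the magnitude in the low 31 bits. */
uint32_t encode_sign_magnitude(uint32_t magnitude)
{
    assert(magnitude < UINT32_C(0x80000000));  /* must fit in 31 bits */
    return UINT32_C(0x80000000) | magnitude;
}

/* Ones' complement: invert every bit of the magnitude. */
uint32_t encode_ones_complement(uint32_t magnitude)
{
    return (uint32_t)~magnitude;
}

/* Two's complement: invert every bit, then add one. */
uint32_t encode_twos_complement(uint32_t magnitude)
{
    return (uint32_t)(~magnitude + 1u);
}
```

For a magnitude of 1, only the sign-magnitude scheme gives the 0x80000001 pattern guessed in the question; two's complement gives 0xFFFFFFFF.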

For unsigned integer types (size_t being one of those), the C standard (and the C++ standard too, I think) specifies precise overflow rules. In short, if SIZE_MAX is the maximum value of the type size_t, then the expression

(size_t) (SIZE_MAX + 1)

is guaranteed to be 0. Because conversion to an unsigned type reduces the value modulo SIZE_MAX + 1, you can therefore be sure that (size_t) -1 is equal to SIZE_MAX. The same holds true for other unsigned types.
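The wraparound rule can be checked directly. The following is a minimal sketch; the helper names are made up for this example, and the checks rely only on the standard headers shown.

```c
#include <assert.h>
#include <limits.h>   /* UCHAR_MAX, USHRT_MAX, UINT_MAX, ULONG_MAX */
#include <stddef.h>   /* size_t */
#include <stdint.h>   /* SIZE_MAX */

/* Hypothetical helper: adds 1 to SIZE_MAX in size_t arithmetic.
   The result is reduced modulo SIZE_MAX + 1, so it wraps around. */
size_t size_max_plus_one(void)
{
    size_t s = SIZE_MAX;
    return s + 1;
}

/* Hypothetical helper: aborts via assert if converting -1 to an unsigned
   type fails to produce that type's maximum value. */
void check_minus_one_is_max(void)
{
    assert((size_t)-1 == SIZE_MAX);
    assert((unsigned char)-1 == UCHAR_MAX);
    assert((unsigned short)-1 == USHRT_MAX);
    assert((unsigned int)-1 == UINT_MAX);
    assert((unsigned long)-1 == ULONG_MAX);
}
```

None of these checks inspects a bit pattern; they compare values only, which is why they do not depend on the machine's representation of signed numbers.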

Note that the above holds true:

  • for all unsigned types,
  • even if the underlying machine doesn't represent numbers in two's complement. In this case, the compiler has to make sure the identity holds true.

Also, the above means that you can't rely on specific representations for signed types.

Edit: To answer some of the comments:

Let's say we have a code snippet like:

int i = -1;
long j = i;

There is a type conversion in the assignment to j. Assuming that int and long have different sizes (as on most 64-bit Unix-like systems; 64-bit Windows keeps both at 32 bits), the bit patterns at the memory locations for i and j are going to be different, because the objects have different sizes. The compiler makes sure that the values of both i and j are -1.
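As a small sketch of this point, the helpers below (hypothetical names) perform the same conversion as the assignment to j and report how much larger a long is than an int. The second helper assumes that long is at least as large as int, which holds on common platforms.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: converts an int to long the same way the
   assignment `long j = i;` does. The conversion is defined on values. */
long widen_to_long(int i)
{
    long j = i;
    assert(j == i);   /* every int value is representable as a long */
    return j;
}

/* Hypothetical helper: number of extra bytes a long occupies compared
   with an int (0 where both types have the same size).
   Assumes sizeof(long) >= sizeof(int). */
size_t extra_bytes_in_long(void)
{
    return sizeof(long) - sizeof(int);
}
```

Whatever extra_bytes_in_long() returns on a given platform, widen_to_long(-1) still compares equal to -1.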

Similarly, when we do:

size_t s = (size_t) -1;

There is a type conversion going on. The -1 is of type int. It has a bit pattern, but that is irrelevant for this example: when the cast converts it to size_t, the compiler translates the value according to the rules for the destination type (size_t in this case). Thus, even if int and size_t have different sizes, the standard guarantees that the value stored in s above will be the maximum value that size_t can take.
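One way to see that the source width does not matter is to convert -1 from signed types of several widths. This is only a sketch, and the helper names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>   /* size_t */
#include <stdint.h>   /* SIZE_MAX */

/* Hypothetical helper: converts a signed value to size_t by value. */
size_t to_size_t(long long v)
{
    return (size_t)v;
}

/* Hypothetical helper: converts -1 held in three signed types of
   different widths and aborts via assert if any result is not SIZE_MAX. */
void check_value_conversion(void)
{
    signed char c = -1;
    int i = -1;
    long long ll = -1;

    assert((size_t)c == SIZE_MAX);
    assert((size_t)i == SIZE_MAX);
    assert((size_t)ll == SIZE_MAX);
}
```

By the same modular rule, to_size_t(-2) yields SIZE_MAX - 1.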

If we do:

long j = LONG_MAX;
int i = j;

If LONG_MAX is greater than INT_MAX, then the value in i is implementation-defined (C89, section 3.2.1.2).
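Since the result of an out-of-range narrowing conversion is left to the implementation, portable code usually checks the range first. The following is a minimal sketch of that idea; long_to_int_checked is a hypothetical helper, not a standard function.

```c
#include <assert.h>
#include <limits.h>   /* INT_MIN, INT_MAX */
#include <stddef.h>   /* NULL */

/* Hypothetical helper: if value fits in an int, stores it in *out and
   returns 1. Otherwise returns 0 and leaves *out untouched, so the
   implementation-defined conversion is never performed. */
int long_to_int_checked(long value, int *out)
{
    assert(out != NULL);
    if (value < INT_MIN || value > INT_MAX)
        return 0;
    *out = (int)value;   /* in range, so the value is preserved */
    return 1;
}
```

On a platform where long and int have the same range, the check never fails and every long is accepted.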
