Why does sizeof for built-in types other than char depend on the compiler in C and C++?

2021-12-25 00:00:00 c types c++

Why are fundamental types in C and C++ not strictly defined as they are in Java, where an int is always 4 bytes, a long is always 8 bytes, and so on? To my knowledge, in C and C++ only a char is defined as 1 byte, and everything else is defined differently by different compilers. So an int in C and C++ does not necessarily have to be 4 bytes, as long as it is longer than a short and a short is longer than a char.
I was just wondering why that is, and whether it serves any purpose.

Recommended answer

The reason is largely that C is portable to a much wider variety of platforms. There are many reasons why the different data types have ended up with the sizes they have on various platforms, but at least historically, int has been adapted to the platform's native word size. On the PDP-11 it was 16 bits (and long was originally invented for 32-bit numbers), while some embedded platform compilers even have 8-bit ints. When 32-bit platforms came around and started having 32-bit ints, short was invented to represent 16-bit numbers.

Nowadays, most 64-bit architectures use 32-bit ints simply to be compatible with the large base of C programs that were originally written for 32-bit platforms, but there have been 64-bit C compilers with 64-bit ints as well, not least on some early Cray platforms.

Also, in the earlier days of computing, floating-point formats and sizes were generally far less standardized (IEEE 754 didn't come around until 1985), which is why floats and doubles are even less well-defined than the integer data types. They generally don't even presume the presence of such peculiarities as infinities, NaNs or signed zeroes.

Furthermore, it should perhaps be said that a char is not defined to be 1 byte, but rather to be whatever sizeof returns 1 for, which is not necessarily 8 bits. (For completeness, it should perhaps also be added that "byte" as a term is not universally defined to be 8 bits; there have been many historical definitions of it, and in the context of the ANSI C standard, a "byte" is actually defined to be the smallest unit of storage that can store a char, whatever the nature of char.)

There are also architectures such as the 36-bit PDP-10 and the 18-bit PDP-7 that have run C programs. They may be quite rare these days, but they do help explain why C data types are not defined in terms of 8-bit units.

Whether this, in the end, really makes the language "more portable" than languages like Java can perhaps be debated, but it would certainly be suboptimal to run Java programs on 16-bit processors, and quite weird indeed on 36-bit processors. It is perhaps fair to say that it makes the language more portable, but programs written in it less portable.

In reply to some of the comments, I just want to add, as an opinion, that C as a language is unlike languages such as Java, Haskell or Ada, which are more or less "owned" by a corporation or standards body. There is ANSI C, sure, but C is more than ANSI C; it's a living community, and there are many implementations that aren't ANSI-compatible but are "C" nevertheless. Arguing whether implementations that use 8-bit ints are C is similar to arguing whether Scots is English, in that it's mostly pointless. They use 8-bit ints for good reasons, no one who knows C well enough would be unable to reason about programs written for such compilers, and anyone who writes C programs for such architectures would want their ints to be 8 bits.
