When does endianness become a factor?

2021-12-20 00:00:00 networking c++ stl endianness

As I understand it, endianness refers to the order of the bytes that make up a multibyte word, at least in the most typical case, so that a 16-bit integer may be stored as either 0xHHLL or 0xLLHH.

Assuming I don't have that wrong, what I would like to know is when endianness becomes a major factor when sending information between two computers whose endianness may or may not differ.

If I transmit a short integer of 1 in the form of a char array, with no correction, is it received and interpreted as 256?

If I decompose and recompose the short integer using the following code, will endianness no longer be a factor?

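    // Sender: copy each bit of the value into the bitset, lowest bit first
    for (n = 0; n < sizeof(uint16) * 8; ++n) {
        stl_bitset[n] = (value >> n) & 1;
    }

    // Receiver: rebuild the value bit by bit (value must start at 0)
    for (n = 0; n < sizeof(uint16) * 8; ++n) {
        value |= uint16(stl_bitset[n] & 1) << n;
    }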

Is there a standard way of compensating for endianness?

Thanks in advance!

Recommended answer

Very abstractly speaking, endianness is a property of the reinterpretation of a variable as a char array.

Practically, this matters precisely when you read() from and write() to an external byte stream (like a file or a socket). Or, speaking abstractly again, endianness matters when you serialize data, essentially because serialized data has no type system and just consists of dumb bytes. Endianness does not matter within your programming language, because the language only operates on values, not on representations. Going from one to the other is where you need to dig into the details.
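To make the question's example concrete, here is a small sketch that interprets the same two bytes under both conventions. The helper names and the `wire` array are made up for illustration and are not part of the original post. A sender that dumps the in-memory bytes of the 16-bit value 1 on a little-endian machine emits 0x01 0x00; a receiver that assumes big-endian order reads those same bytes as 256.

```cpp
#include <cstdint>

// Hypothetical helpers: interpret two raw bytes as a 16-bit unsigned
// value under an explicitly chosen byte order. They work on values only,
// so they behave the same on any host.
std::uint16_t read_u16_le(const unsigned char* p) {
    // First byte is the least significant one.
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

std::uint16_t read_u16_be(const unsigned char* p) {
    // First byte is the most significant one.
    return static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}

// The bytes a little-endian host holds in memory for the value 1.
const unsigned char wire[2] = {0x01, 0x00};
```

With these definitions, `read_u16_le(wire)` yields 1 and `read_u16_be(wire)` yields 256, so the answer to "is it interpreted as 256?" is yes whenever sender and receiver disagree about the byte order.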

To wit, writing:

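    uint32_t n = get_number();

    // Each cast keeps only the low 8 bits of the shifted value; without it,
    // the braced initializer is a narrowing conversion in C++11 and later.
    unsigned char bytesLE[4] = {  // little-endian order
        static_cast<unsigned char>(n),       static_cast<unsigned char>(n >> 8),
        static_cast<unsigned char>(n >> 16), static_cast<unsigned char>(n >> 24) };
    unsigned char bytesBE[4] = {  // big-endian order
        static_cast<unsigned char>(n >> 24), static_cast<unsigned char>(n >> 16),
        static_cast<unsigned char>(n >> 8),  static_cast<unsigned char>(n) };

    write(bytes..., 4);  // write whichever array matches the stream's byte order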

Here we could just have said reinterpret_cast<unsigned char *>(&n), and the result would have depended on the endianness of the system.

Reading:

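    unsigned char buf[4] = read_data();  // pseudocode: obtain 4 bytes from the stream

    // Widen each byte before shifting, and parenthesize every shift:
    // << binds less tightly than +, so the unparenthesized form computes the wrong value.
    uint32_t n_LE = uint32_t(buf[0]) | (uint32_t(buf[1]) << 8)
                  | (uint32_t(buf[2]) << 16) | (uint32_t(buf[3]) << 24);  // little-endian
    uint32_t n_BE = uint32_t(buf[3]) | (uint32_t(buf[2]) << 8)
                  | (uint32_t(buf[1]) << 16) | (uint32_t(buf[0]) << 24);  // big-endian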

Again, here we could have said uint32_t n = *reinterpret_cast<uint32_t*>(buf), and the result would have depended on the machine endianness. (That cast also runs afoul of the alignment and strict-aliasing rules; std::memcpy into a uint32_t is the well-defined way to get the same host-order result.)
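The writing and reading halves can be packaged as a pair of functions. The following is a minimal sketch, assuming the stream format is big-endian (the usual "network byte order"); `store_be32` and `load_be32` are hypothetical names chosen for this example, not functions from any library.

```cpp
#include <cstdint>

// Hypothetical helper: write a 32-bit value into out[0..3],
// most significant byte first (big-endian stream order).
void store_be32(std::uint32_t n, unsigned char* out) {
    out[0] = static_cast<unsigned char>(n >> 24);
    out[1] = static_cast<unsigned char>(n >> 16);
    out[2] = static_cast<unsigned char>(n >> 8);
    out[3] = static_cast<unsigned char>(n);
}

// Hypothetical helper: rebuild the value from 4 big-endian bytes.
// Each byte is widened to 32 bits before shifting so no bits are lost.
std::uint32_t load_be32(const unsigned char* in) {
    return (static_cast<std::uint32_t>(in[0]) << 24) |
           (static_cast<std::uint32_t>(in[1]) << 16) |
           (static_cast<std::uint32_t>(in[2]) << 8)  |
            static_cast<std::uint32_t>(in[3]);
}
```

Storing 0x11223344 produces the bytes 0x11 0x22 0x33 0x44 on every host, and loading them gives back 0x11223344, because neither function ever looks at how the host lays out a uint32_t in memory.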

As you can see, with integral types you never have to know the endianness of your own system, only that of the data stream, provided you use algebraic input and output operations. With other data types such as double, the issue is more complicated.
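One common way to handle double is to copy its bit pattern into a same-sized integer and then serialize that integer algebraically. The sketch below is only valid under stated assumptions: both ends use the IEEE 754 64-bit format, and on each host a double uses the same byte order as a 64-bit integer (true on mainstream platforms, but not guaranteed by the language). The function names are hypothetical.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// Assumption check: double is a 64-bit IEEE 754 (IEC 559) type on this host.
static_assert(sizeof(double) == sizeof(std::uint64_t) &&
              std::numeric_limits<double>::is_iec559,
              "this sketch assumes 64-bit IEEE 754 doubles");

// Hypothetical helper: write a double as 8 bytes, big-endian.
void store_double_be(double d, unsigned char* out) {
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);  // well-defined copy of the bit pattern
    for (int i = 0; i < 8; ++i)
        out[i] = static_cast<unsigned char>(bits >> (56 - 8 * i));
}

// Hypothetical helper: rebuild a double from 8 big-endian bytes.
double load_double_be(const unsigned char* in) {
    std::uint64_t bits = 0;
    for (int i = 0; i < 8; ++i)
        bits = (bits << 8) | in[i];
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}
```

For example, 1.0 has the bit pattern 0x3FF0000000000000, so `store_double_be(1.0, buf)` writes 0x3F 0xF0 followed by six zero bytes. If either assumption does not hold for a target platform, a text format or an explicit sign/exponent/mantissa encoding is the safer choice.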
