What's the point of using std::ios_base::binary?

2022-01-07 00:00:00 iostream eol c++ stl

I had an issue reading a Linux file under Windows. Here is the issue discussion: "Using fstream::seekg under windows on a file created under Unix".

The issue was worked around by opening the text file with std::ios_base::binary specified.

But what's the actual point of this mode? Even when it is specified, you can still work with your file as a text file (writing with mystream << "Hello World" << std::endl and reading with std::getline).

Under Windows, the only difference I could notice is that mystream << "Hello World" << std::endl uses:

  • 0x0D 0x0A as line separator if std::ios_base::binary was not specified (carriage return followed by line feed)
  • 0x0A as line separator if std::ios_base::binary was specified (line feed only)
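
To check the two cases above yourself, something like the following sketch may help. It writes the same line in a chosen mode and then reads the file back byte for byte. The helper names write_line_demo and read_raw_bytes are made up for this illustration, and the result for text mode depends on the platform (identical to binary mode on Unix, 0x0D 0x0A on Windows).

```cpp
#include <fstream>
#include <ios>
#include <iterator>
#include <string>

// Hypothetical helper: write one line to `path` using the given open mode.
void write_line_demo(const std::string& path, std::ios_base::openmode mode) {
    std::ofstream out(path, mode);
    out << "Hello World" << std::endl;  // inserts '\n' and flushes
}

// Hypothetical helper: read the file back without any newline translation,
// so the returned string holds exactly the bytes stored on disk.
std::string read_raw_bytes(const std::string& path) {
    std::ifstream in(path, std::ios_base::in | std::ios_base::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Calling write_line_demo once with std::ios_base::out and once with std::ios_base::out | std::ios_base::binary, then comparing the strings returned by read_raw_bytes, shows whether the runtime translated the newline.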

Notepad does not display the line breaks properly when opening files generated with std::ios_base::binary. Better editors like vi or WordPad do show them.

Is that really the only difference between files generated with and without std::ios_base::binary? The documentation says "Consider stream as binary rather than text."; what does this mean in the end?

Is it safe to always set std::ios_base::binary if I don't care about opening the file in Notepad and want fstream::seekg to always work?

Recommended answer

The differences between binary and text modes are implementation defined, but they only concern the lowest level: they do not change the meaning of things like << and >> (which insert and extract textual data). Also, formally, outputting all but a few non-printable characters (like '\n') is undefined behavior if the file is in text mode.

For the most common OSs: under Unix, there is no distinction; both modes are identical. Under Windows, '\n' internally will be mapped to the two-character sequence CR, LF (0x0D, 0x0A) externally, and 0x1A will be interpreted as an end of file when reading. In more exotic (and mostly extinct) OSs, however, the two modes could be represented by entirely different file types at the OS level, and it could be impossible to read a file in text mode if it were written in binary mode, and vice versa. Or you could see something different: extra white space at the end of a line, or no '\n' in binary mode.
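
One practical consequence of this: since binary mode performs no mapping, a file that contains CR, LF terminators and is read with std::getline in binary mode keeps the CR (0x0D) at the end of each line. A minimal sketch of handling that by hand, where read_lines_binary and its strip_cr parameter are names invented for this example:

```cpp
#include <fstream>
#include <ios>
#include <string>
#include <vector>

// Hypothetical helper: read all lines from `path` in binary mode, so no
// newline translation takes place. std::getline still splits on '\n'.
// If strip_cr is true, a trailing '\r' left over from a CR LF terminator
// is removed from each line.
std::vector<std::string> read_lines_binary(const std::string& path, bool strip_cr) {
    std::ifstream in(path, std::ios_base::in | std::ios_base::binary);
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) {
        if (strip_cr && !line.empty() && line.back() == '\r') {
            line.pop_back();
        }
        lines.push_back(line);
    }
    return lines;
}
```

With strip_cr set to false, each line of a CR, LF file comes back with the '\r' still attached; with it set to true, the same code handles both LF-only and CR, LF files.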

With regard to always setting std::ios_base::binary: my policy for portable files is to decide exactly how I want them formatted, set binary, and output what I want. That is often CR, LF rather than just LF, since that's the network standard. On the other hand, most Windows programs have no problems with just LF, but I've encountered more than a few Unix programs which have problems with CR, LF, which argues for systematically using just LF (which is easier, too). Doing things this way means that I get the same results regardless of whether I'm running under Unix or under Windows.
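
A minimal sketch of that policy, assuming the caller picks the terminator explicitly. The function name write_lines_exact and its parameters are hypothetical, not part of any library:

```cpp
#include <fstream>
#include <ios>
#include <string>
#include <vector>

// Hypothetical helper: write each line followed by a terminator chosen by
// the caller ("\n" or "\r\n"). Binary mode keeps the runtime from rewriting
// '\n', so the bytes on disk should be the same on Unix and on Windows.
bool write_lines_exact(const std::string& path,
                       const std::vector<std::string>& lines,
                       const std::string& eol) {
    std::ofstream out(path, std::ios_base::out | std::ios_base::binary |
                                std::ios_base::trunc);
    for (const std::string& line : lines) {
        out << line << eol;  // std::endl is avoided: it always inserts '\n'
    }
    out.flush();
    return static_cast<bool>(out);  // false if opening or writing failed
}
```

For example, write_lines_exact("out.txt", {"a", "b"}, "\r\n") is meant to produce the six bytes a CR LF b CR LF on either platform, while passing "\n" gives the LF-only variant.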
