How to write a std::string to a UTF-8 text file

2021-12-28 00:00:00 utf-8 c++

I just want to write a few simple lines to a text file in C++, but I want them to be encoded in UTF-8. What is the simplest way to do so?

Recommended answer

The only way UTF-8 affects std::string is that size(), length(), and all indices are measured in bytes, not characters.
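To make the byte-versus-character point concrete, here is a minimal sketch. The helper names sample_text and byte_count are made up for illustration; the string is spelled with explicit byte escapes so the result does not depend on how the source file itself is encoded.

```cpp
#include <cstddef>
#include <string>

// "héllo" written with explicit UTF-8 bytes: 'é' (U+00E9) is encoded as
// the two bytes 0xC3 0xA9, so five characters occupy six bytes.
// (sample_text is a hypothetical helper, not a library function.)
std::string sample_text() { return "h\xC3\xA9llo"; }

// size() and length() both report the number of bytes stored,
// not the number of characters a reader would see.
std::size_t byte_count(const std::string& s) { return s.size(); }
```

Calling byte_count(sample_text()) yields 6 even though the text has five visible characters, and sample_text()[1] is the first byte of 'é', not a whole character.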

And, as sbi points out, incrementing a std::string iterator advances by one byte, not one character, so the iterator can end up pointing into the middle of a multibyte UTF-8 code point. The standard library provides no UTF-8-aware iterator, but several are available online.
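If a full UTF-8 iterator library is more than you need, stepping by code point can be sketched by hand. The sketch below assumes the string already holds valid UTF-8 and relies on the fact that continuation bytes always have the bit pattern 10xxxxxx. The function names are hypothetical, and no validation is attempted.

```cpp
#include <cstddef>
#include <string>

// True if the byte is a UTF-8 continuation byte (bit pattern 10xxxxxx).
bool is_continuation(unsigned char b) { return (b & 0xC0) == 0x80; }

// Counts code points by counting every byte that is NOT a continuation
// byte. Assumes valid UTF-8; malformed input gives a meaningless count.
std::size_t count_code_points(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char b : s) {
        if (!is_continuation(b)) ++n;
    }
    return n;
}

// Returns the byte index where the next code point starts after index i,
// or s.size() if there is none. Skips over continuation bytes so the
// result never lands in the middle of a multibyte sequence.
std::size_t next_code_point(const std::string& s, std::size_t i) {
    if (i < s.size()) ++i;
    while (i < s.size() && is_continuation(static_cast<unsigned char>(s[i]))) {
        ++i;
    }
    return i;
}
```

For the six-byte string "h\xC3\xA9llo", count_code_points returns 5, and next_code_point(s, 1) returns 3, stepping over both bytes of 'é' at once.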

If you keep that in mind, you can put UTF-8 into a std::string, write it to a file, and so on, all in the usual way (by which I mean the way you would use a std::string with no UTF-8 inside).
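A minimal sketch of "the usual way", assuming the string already contains UTF-8 bytes. The function name write_text_file is made up for illustration. The stream does no encoding conversion: it writes the bytes it is given.

```cpp
#include <fstream>
#include <string>

// Writes the bytes of `text` to the file at `path` with an ordinary
// std::ofstream. If `text` holds UTF-8, the file holds UTF-8.
// Returns false if the file could not be opened or the write failed.
// (write_text_file is a hypothetical helper name.)
bool write_text_file(const std::string& path, const std::string& text) {
    std::ofstream out(path);
    if (!out) return false;
    out << text;
    return static_cast<bool>(out);
}
```

For example, write_text_file("out.txt", "h\xC3\xA9llo\n") produces a file that a UTF-8-aware editor displays as "héllo". In the default text mode some platforms translate '\n' to their native line ending, which is normally what you want for a text file.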

You may want to start your file with a byte order mark (in UTF-8, the three bytes EF BB BF) so that other programs will know it is UTF-8.
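Writing the byte order mark could look like the sketch below. The function names are hypothetical. The file is opened in binary mode here as a design choice, so the bytes reach the disk exactly as given. Whether a BOM helps depends on the programs that will read the file, since some tools do not expect one.

```cpp
#include <fstream>
#include <string>

// The UTF-8 encoding of U+FEFF, used as a byte order mark: EF BB BF.
const std::string& utf8_bom() {
    static const std::string bom = "\xEF\xBB\xBF";
    return bom;
}

// Writes the BOM followed by `text` (assumed to already be UTF-8).
// Returns false if the file could not be opened or the write failed.
// (write_utf8_with_bom is a hypothetical helper name.)
bool write_utf8_with_bom(const std::string& path, const std::string& text) {
    std::ofstream out(path, std::ios::binary);
    if (!out) return false;
    out << utf8_bom() << text;
    return static_cast<bool>(out);
}
```

The BOM must be the very first thing written to the file; anything placed before it stops readers from recognizing it as a mark.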
