What encoding does std::string.c_str() use?

2021-12-28 00:00:00 string utf-8 c++

I am trying to convert a C++ std::string to UTF-8 or to std::wstring without losing information (consider a string that contains non-ASCII characters).

According to http://forums.sun.com/thread.jspa?threadID=486770&forumID=31:

If the std::string has non-ASCII characters, you must provide a function that converts from your encoding to UTF-8 [...]

What encoding does std::string.c_str() use? How can I convert it to UTF-8 or std::wstring in a cross-platform fashion?

Recommended answer

std::string per se uses no encoding: c_str() returns exactly the bytes you put into the string. Those bytes might be encoded as ISO-8859-1, for example, or as anything else. The information about the encoding is simply not stored in the string, so you have to know where the bytes came from.
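
Once the source encoding is known, the conversion itself is mechanical. As a minimal sketch, assume the bytes are known to be ISO-8859-1 (an assumption for illustration; the question does not say which encoding is in play). The helper name latin1_to_utf8 is hypothetical and not part of any library. It relies on the fact that every ISO-8859-1 byte value equals its Unicode code point, so bytes below 0x80 are copied as-is and bytes from 0x80 to 0xFF become a two-byte UTF-8 sequence.

```cpp
#include <string>

// Hypothetical helper: re-encode bytes that are KNOWN to be ISO-8859-1 as UTF-8.
// This is only correct under that assumption; std::string cannot tell us
// which encoding its bytes use.
std::string latin1_to_utf8(const std::string& latin1) {
    std::string utf8;
    utf8.reserve(latin1.size() * 2);  // worst case: every byte expands to two
    for (unsigned char byte : latin1) {
        if (byte < 0x80) {
            // ASCII range: identical in ISO-8859-1 and UTF-8.
            utf8.push_back(static_cast<char>(byte));
        } else {
            // U+0080..U+00FF: two-byte UTF-8 sequence 110xxxxx 10xxxxxx.
            utf8.push_back(static_cast<char>(0xC0 | (byte >> 6)));
            utf8.push_back(static_cast<char>(0x80 | (byte & 0x3F)));
        }
    }
    return utf8;
}
```

For instance, latin1_to_utf8("caf\xE9") turns the four bytes 63 61 66 E9 into the five bytes 63 61 66 C3 A9. For any other source encoding (a Windows code page, Shift-JIS, and so on) a table-driven or library-based conversion would be needed instead; this sketch does not cover those cases.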
