How to open an std::fstream (ofstream or ifstream) with a Unicode filename?
You wouldn't imagine that something as basic as opening a file with the C++ standard library in a Windows application would be tricky, but it appears to be. By Unicode here I mean UTF-8, but I can convert to UTF-16 or whatever; the point is getting an ofstream instance from a Unicode filename. Before I hack up my own solution, is there a preferred route here? Especially a cross-platform one?
Recommended answer
The C++ standard library is not Unicode-aware. char and wchar_t are not required to be Unicode encodings.
On Windows, wchar_t is UTF-16, but there is no direct support for UTF-8 filenames in the standard library (the char datatype is not Unicode on Windows).
With MSVC (and thus the Microsoft STL), the file streams provide a constructor that takes a const wchar_t* filename, allowing you to create the stream as:
wchar_t const name[] = L"filename.txt";
std::fstream file(name);
However, this overload is not specified by the C++11 standard, which only guarantees the presence of the char-based version. It is also not present in alternative STL implementations such as GCC's libstdc++ for MinGW(-w64), as of g++ 4.8.x.
Note that just as char on Windows is not UTF-8, wchar_t on other operating systems may not be UTF-16. So overall, this isn't likely to be portable. Opening a stream given a wchar_t filename isn't defined by the standard, and specifying the filename in chars may be difficult because the encoding used by char varies between operating systems.
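One way to hide the platform difference described above is a small wrapper that always takes a UTF-8 filename. The following is only a sketch under stated assumptions, not a definitive solution: the helper names utf8_to_wide and open_utf8_ofstream are made up for illustration, the MSVC branch relies on the non-standard wchar_t const* constructor mentioned earlier plus the Win32 function MultiByteToWideChar, and the other branch assumes the system interprets narrow filenames as UTF-8 (common on Linux and macOS, but not guaranteed, and not true for MinGW on Windows).

```cpp
#include <cstddef>
#include <fstream>
#include <string>

#ifdef _MSC_VER
#include <windows.h>

// Hypothetical helper: convert UTF-8 to UTF-16 using the Win32 API.
// Returns an empty string if the input is empty or the conversion fails.
static std::wstring utf8_to_wide(const std::string& utf8) {
    if (utf8.empty()) return std::wstring();
    const int in_len = static_cast<int>(utf8.size());
    const int out_len =
        MultiByteToWideChar(CP_UTF8, 0, utf8.data(), in_len, nullptr, 0);
    if (out_len <= 0) return std::wstring();
    std::wstring wide(static_cast<std::size_t>(out_len), L'\0');
    MultiByteToWideChar(CP_UTF8, 0, utf8.data(), in_len, &wide[0], out_len);
    return wide;
}
#endif

// Hypothetical helper: open an output file stream from a UTF-8 filename.
inline std::ofstream open_utf8_ofstream(const std::string& utf8_name) {
#ifdef _MSC_VER
    // Microsoft STL extension: constructor taking a wchar_t const* filename.
    return std::ofstream(utf8_to_wide(utf8_name).c_str());
#else
    // Assumption: narrow filenames are treated as UTF-8 on this platform.
    return std::ofstream(utf8_name.c_str());
#endif
}
```

A caller would write something like std::ofstream out = open_utf8_ofstream(name); and check out.is_open() before writing. Keeping the conversion in one function means the rest of the program can store filenames as UTF-8 std::string everywhere.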