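What is the fastest method for high-performance sequential file I/O in C++?
Assuming the following...
Output:
A file is opened...
Data is "streamed" to disk. The data in memory is in a large contiguous buffer. It is written to disk in its raw form directly from that buffer. The size of the buffer is configurable, but fixed for the duration of the stream. Buffers are written to the file, one after another. No seek operations are conducted.
...the file is closed.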
Input:
A large file (sequentially written as above) is read from disk from beginning to end.
Are there generally accepted guidelines for achieving the fastest possible sequential file I/O in C++?
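Some possible considerations:
- Guidelines for choosing the optimal buffer size
- Will a portable library like boost::asio be too abstracted to expose the intricacies of a specific platform, or can it be assumed to be optimal?
- Is asynchronous I/O always preferable to synchronous? What if the application is not otherwise CPU-bound?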
I realize that this will have platform-specific considerations. I welcome general guidelines as well as those for particular platforms.
(My most immediate interest is in Win x64, but I am interested in comments on Solaris and Linux as well.)
Recommended answer
Are there generally accepted guidelines for achieving the fastest possible sequential file I/O in C++?
Rule 0: Measure. Use all available profiling tools and get to know them. It's almost a commandment in programming that if you didn't measure it you don't know how fast it is, and for I/O this is even more true. Make sure to test under actual work conditions if you possibly can. A process that has no competition for the I/O system can be over-optimized, fine-tuned for conditions that don't exist under real loads.
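As a starting point for Rule 0, here is a minimal timing sketch. The helper name `time_sequential_write` and its parameters are made up for illustration and are not from the question. It only measures how long it takes to hand the data to the operating system; on most systems the page cache absorbs writes, so the data may not have reached the disk when the timer stops.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical helper: writes `total_bytes` to `path` in chunks of
// `chunk_bytes` and returns the elapsed wall-clock time in seconds
// (including fclose), or a negative value on failure.
double time_sequential_write(const char* path,
                             std::size_t chunk_bytes,
                             std::size_t total_bytes) {
    assert(chunk_bytes > 0);
    std::vector<char> buffer(chunk_bytes, 'x');

    std::FILE* f = std::fopen(path, "wb");
    if (!f) return -1.0;
    // Turn off stdio buffering so the chunk size under test is roughly
    // the size of each request that reaches the OS.
    std::setvbuf(f, nullptr, _IONBF, 0);

    auto start = std::chrono::steady_clock::now();
    std::size_t written = 0;
    bool ok = true;
    while (written < total_bytes) {
        std::size_t n = std::min(chunk_bytes, total_bytes - written);
        if (std::fwrite(buffer.data(), 1, n, f) != n) { ok = false; break; }
        written += n;
    }
    ok = (std::fclose(f) == 0) && ok;
    auto end = std::chrono::steady_clock::now();

    if (!ok) return -1.0;
    return std::chrono::duration<double>(end - start).count();
}
```

Calling it in a loop over several chunk sizes, for example `time_sequential_write("out.bin", 64 * 1024, 256 * 1024 * 1024)`, gives a rough comparison, but the numbers only mean something when taken under a realistic load and repeated several times.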
Use mapped memory instead of writing to files. This isn't always faster, but it allows the opportunity to optimize the I/O in an operating system-specific but relatively portable way, by avoiding unnecessary copying and taking advantage of the OS's knowledge of how the disk is actually being used. ("Portable" if you use a wrapper, not an OS-specific API call.)
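To make the mapped-memory idea concrete, here is a minimal sketch that uses the raw POSIX calls (`open`, `ftruncate`, `mmap`, `msync`, `munmap`) rather than a portable wrapper, so it applies to Linux and Solaris but not to Windows. The helper name `write_via_mmap` is an assumption for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper: stores `size` bytes from `data` in the file at `path`
// through a shared memory mapping. Returns true on success. POSIX only.
bool write_via_mmap(const char* path, const void* data, std::size_t size) {
    assert(size > 0);  // zero-length mappings are rejected by mmap
    int fd = ::open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return false;

    // The file has to be at least as large as the mapping before it is touched.
    if (::ftruncate(fd, static_cast<off_t>(size)) != 0) {
        ::close(fd);
        return false;
    }

    void* map = ::mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        ::close(fd);
        return false;
    }

    // Writing to the mapping dirties pages; the kernel decides when to flush them.
    std::memcpy(map, data, size);

    // Request a synchronous flush so errors surface here rather than later.
    bool ok = (::msync(map, size, MS_SYNC) == 0);
    ok = (::munmap(map, size) == 0) && ok;
    ok = (::close(fd) == 0) && ok;
    return ok;
}
```

The `memcpy` here is itself a copy, so this sketch only demonstrates the mechanics. The copy is avoided when the code that produces the data writes straight into the mapped region instead of into a separate buffer.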
Try and linearize your output as much as possible. Having to jump around memory to find the buffers to write can have noticeable effects under optimized conditions, because cache lines, paging and other memory subsystem issues will start to matter. If you have lots of buffers look into support for scatter-gather I/O which tries to do that linearizing for you.
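One form of scatter-gather support is the POSIX `writev` call, which takes an array of buffers and writes them in order with a single system call. The sketch below assumes a POSIX system; the `Chunk` struct and the `write_gathered` helper are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <sys/uio.h>
#include <unistd.h>
#include <vector>

// Hypothetical description of one buffer that lives somewhere in memory.
struct Chunk {
    const void* data;
    std::size_t size;
};

// Hypothetical helper: hands all chunks to the kernel in one writev call.
// Returns the number of bytes written, or -1 on error.
// A production version would also loop on short writes and split the
// list when it is longer than the system's IOV_MAX limit.
ssize_t write_gathered(int fd, const std::vector<Chunk>& chunks) {
    assert(!chunks.empty());
    std::vector<iovec> iov;
    iov.reserve(chunks.size());
    for (const Chunk& c : chunks) {
        iovec v;
        // iov_base is non-const in the API, but writev only reads from it.
        v.iov_base = const_cast<void*>(c.data);
        v.iov_len = c.size;
        iov.push_back(v);
    }
    return ::writev(fd, iov.data(), static_cast<int>(iov.size()));
}
```

With three chunks holding "abc", "de" and "fgh", a successful call leaves "abcdefgh" in the file, without the application first copying the pieces into one contiguous buffer.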
Some possible considerations:
- Guidelines for choosing the optimal buffer size
Page size for starters, but be ready to tune from there.
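A small sketch of "page size for starters": query the page size and round the configured buffer size up to a whole number of pages. It assumes a POSIX system (`sysconf`); on Windows the equivalent value comes from `GetSystemInfo`, in the `dwPageSize` field. The helper names and the 4096-byte fallback are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <unistd.h>

// Hypothetical helper: returns the system page size in bytes.
// The 4096 fallback is an assumption used only if sysconf fails.
std::size_t page_size() {
    long n = ::sysconf(_SC_PAGESIZE);
    return n > 0 ? static_cast<std::size_t>(n) : 4096;
}

// Hypothetical helper: rounds `requested` up to a whole number of pages.
std::size_t round_up_to_pages(std::size_t requested, std::size_t page) {
    assert(page > 0);
    return ((requested + page - 1) / page) * page;
}
```

For example, `round_up_to_pages(100000, page_size())` gives 102400 on a system with 4096-byte pages. From there, try larger multiples and keep whichever measures best under realistic load.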
- Will a portable library like boost::asio be too abstracted to expose the intricacies of a specific platform, or can it be assumed to be optimal?
Don't assume it's optimal. It depends on how thoroughly the library gets exercised on your platform, and how much effort the developers put into making it fast. Having said that, a portable I/O library can be very fast, because fast abstractions exist on most systems, and it's usually possible to come up with a general API that covers a lot of the bases. Boost.Asio is, to the best of my limited knowledge, fairly fine-tuned for the particular platform it is on: there's a whole family of OS and OS-variant specific APIs for fast async I/O (e.g. epoll, /dev/epoll, kqueue, Windows overlapped I/O), and Asio wraps them all.
- Is asynchronous I/O always preferable to synchronous? What if the application is not otherwise CPU-bound?
Asynchronous I/O isn't faster in a raw sense than synchronous I/O. What asynchronous I/O does is ensure that your code is not wasting time waiting for the I/O to complete. It is faster in a general way than the other method of not wasting that time, namely using threads, because it will call back into your code when I/O is ready and not before. There are no false starts or concerns with idle threads needing to be terminated.
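The point about not wasting time waiting can be illustrated with a double-buffering sketch: one buffer is filled while the previous one is being written. Note that this is a stand-in, not true OS-level asynchronous I/O. Standard C++ has no portable asynchronous file API, so the sketch uses `std::async`, which runs the blocking write on another thread, the very alternative the paragraph above compares against. The helper name and the fill pattern are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <future>
#include <vector>

// Hypothetical helper: writes `count` blocks of `size` bytes to `path`,
// producing block i while block i-1 may still be in flight.
// Block i is filled with the letter 'a' + (i % 26) as placeholder data.
bool write_double_buffered(const char* path, std::size_t size, int count) {
    assert(size > 0 && count >= 0);
    std::FILE* f = std::fopen(path, "wb");
    if (!f) return false;

    std::vector<char> buffers[2] = {std::vector<char>(size),
                                    std::vector<char>(size)};
    std::future<bool> pending;  // at most one write is in flight at a time
    bool ok = true;

    for (int i = 0; i < count && ok; ++i) {
        std::vector<char>* buf = &buffers[i % 2];
        // This buffer is free: its previous write finished an iteration ago.
        std::fill(buf->begin(), buf->end(), static_cast<char>('a' + i % 26));

        // Wait for the previous block before queuing this one, which also
        // keeps the blocks in order in the file.
        if (pending.valid()) ok = pending.get();
        if (ok) {
            pending = std::async(std::launch::async, [f, buf] {
                return std::fwrite(buf->data(), 1, buf->size(), f) == buf->size();
            });
        }
    }
    if (pending.valid()) ok = pending.get() && ok;
    ok = (std::fclose(f) == 0) && ok;
    return ok;
}
```

If producing a block costs almost nothing, as in this placeholder fill, the overlap buys little, which matches the point that asynchronous I/O is not faster in a raw sense. A real implementation on Windows would more likely use overlapped I/O, or a library such as Boost.Asio that wraps it.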