What is the best open-source solution for storing time series data?

Question

I want to monitor a set of objects, and I expect to collect about 10,000 data points every 15 minutes. (Perhaps not at first, but that is the general ballpark.) I would also like to be able to get daily, weekly, monthly, and yearly statistics. Keeping the data at the highest resolution (15 minutes) for more than two months is not critical.
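To put those numbers in perspective, here is a rough back-of-the-envelope sizing sketch. The 61-day reading of "two months" and the 8 bytes per value (one float64, with no timestamps or indexes) are assumptions for illustration, not figures from the question.

```python
# Figures stated in the question.
POINTS_PER_SAMPLE = 10_000   # data points collected per sampling interval
SAMPLE_INTERVAL_MIN = 15     # minutes between samples

# Assumptions made for this estimate only.
RETENTION_DAYS = 61          # "two months" at full resolution
BYTES_PER_VALUE = 8          # one float64 per point, no timestamps or indexes

samples_per_day = 24 * 60 // SAMPLE_INTERVAL_MIN
points_per_day = POINTS_PER_SAMPLE * samples_per_day
points_retained = points_per_day * RETENTION_DAYS
raw_bytes = points_retained * BYTES_PER_VALUE

print(f"samples per day:        {samples_per_day}")
print(f"points per day:         {points_per_day:,}")
print(f"points kept at 15 min:  {points_retained:,}")
print(f"uncompressed size:      {raw_bytes / 2**20:.0f} MiB")
```

Under these assumptions the full-resolution window is on the order of tens of millions of values and a few hundred MiB uncompressed, which is a modest volume for most storage options.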

I am considering various ways to store this data, and have been looking at either a classic relational database or a schemaless database (such as SimpleDB).

My question is: what is the best way to go about this? I would very much prefer an open-source (and free) solution to a costly proprietary one.

Small note: I am writing this application in Python.


Answer

HDF5, which can be accessed through h5py or PyTables, is designed for dealing with very large data sets. Both interfaces work well. For example, both h5py and PyTables support transparent compression and work directly with NumPy arrays.
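As a minimal sketch of how this could look with h5py, the snippet below appends 15-minute samples to a resizable, gzip-compressed dataset and then derives a daily mean with NumPy. The file name, the dataset names (`raw_15min`, `daily_mean`), the one-row-per-sample layout, and the random data are all assumptions made for illustration; they are not part of the original answer.

```python
import os
import tempfile

import h5py
import numpy as np

N_OBJECTS = 10_000         # data points per sample, as in the question
SAMPLES_PER_DAY = 24 * 4   # one sample every 15 minutes


def append_sample(dset, values):
    """Append one 15-minute sample (one value per object) as a new row."""
    row = dset.shape[0]
    dset.resize(row + 1, axis=0)
    dset[row, :] = values


# Hypothetical file location; a temporary directory keeps the sketch tidy.
path = os.path.join(tempfile.mkdtemp(), "timeseries.h5")

with h5py.File(path, "w") as f:
    raw = f.create_dataset(
        "raw_15min",                  # hypothetical dataset name
        shape=(0, N_OBJECTS),
        maxshape=(None, N_OBJECTS),   # unlimited along the time axis
        chunks=(1, N_OBJECTS),        # one chunk per appended sample
        dtype="f4",
        compression="gzip",           # compression is applied per chunk
    )

    # Simulate one day of measurements with random values.
    rng = np.random.default_rng(0)
    for _ in range(SAMPLES_PER_DAY):
        append_sample(raw, rng.random(N_OBJECTS, dtype=np.float32))

    # Slicing the dataset returns a NumPy array; reduce it to a daily mean.
    day = raw[:SAMPLES_PER_DAY]
    daily_mean = day.mean(axis=0)
    f.create_dataset("daily_mean", data=daily_mean[np.newaxis, :])
```

With a layout like this, old rows of the full-resolution dataset could be discarded after two months while the much smaller daily (or weekly, monthly, yearly) aggregates are kept indefinitely. PyTables offers comparable functionality through its own API.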
