为日志而生的存储引擎

2020-06-01 00:00:00 专区订阅付费希望建立一个

详细设计文档：Documentation · Issue #1 · esdb/lstore

A log storage engine, that supports

Multi-dimensional search (like Elasticsearch, but optimized for append-only log)
Conditional log append

It is trivial to achieve both goals by storing the log into a relational database, or a key-value store, or a search engine. However, generic storage system is not cost efficient to store log of immutable entries. If we only support "append-only" data, a great deal of optimization is possible. The goal of lstore is to be more than 10x cost efficient than traditional storage systems (still working on that).

Conditional log append is a crucial feature to make lstore reusable in a wide range of scenarios. Any kind of optimistic locking relies on some sort of conditional log append. It is like the lock cmpxchg instruction for the distributed world.

A high level summary of the storage format is "bloomfilter tree of log blocks". Majorities of storage systems built on top of sorting. They works like disk backed tree. However, lstore is built on top of hashing. It works like disk backed hashmap. The architecture is similar with bitcask. But unlike bitcask, lstore is a low level append-only log store, instead of mutable key-value store.

List of optimizations:

Bloom filter tree, which enables multi-dimensional search without sorting
Min-max index (TODO: to support range query)
Off-heap memory management: map file system cache directly as go object
Columnar data block, to scan less data
Columnar data block enables linear searching with SIMD instruction
Columnar data block enables searchable compression (search without decompression first)

Dictionary Encoding (TODO)
Sparse Column (TODO)
FOR integer compression (TODO)

相关文章