LotusDB It's a brand new KV Storage engine ,Github Address ： https://github.com/flower-corp/lotusdb, I hope you can give me more support , Order one star Or get involved ！
LotusDB It's based on LSM Tree Design , And combine B+ Stand alone with tree advantage KV Storage engine , Stable reading and writing performance 、 Fast .
In traditional LSM Tree Architecture , The addition and deletion data are added and written to SST In file , same key The corresponding data may exist in multiple copies , Need to go through complex compaction Strategy to reclaim space , This brings the problems of space amplification and write amplification at the same time .
LSM Tree Maintain multiple levels on disk SSTable file , When reading data , You need to scan the file layer by layer to find the specified data , In the worst case, you need to scan every layer SSTable, Unstable read performance .
and LSM Tree Corresponding , Another common data storage model is B+ Tree,B+ Because the tree has the characteristics of good adaptation to disk pages , It is widely used in database storage engine , For example, the most well-known Mysql Of InnoDB engine .
B+ Tree Maintain the data in the lowest leaf node of the tree , The reading performance is relatively stable , However, the insertion and update of data are random IO To write , Lead to B+ Tree The write performance of is relatively low .
We know ,LSM The storage model was born in HDD（ Mechanical drive ） Time ,HDD There is a huge difference between random and sequential reading and writing speeds , therefore LSM The design maximizes the order IO The advantages of , All data comes to memory first buffer Inner buffer , Then batch and orderly write to the file . But with the update of storage hardware , The difference between random and sequential reading and writing of disks is reduced , In some media , There is not even much difference between sequential and random reading and writing .
LSM Tree For order IO Some designs are too complex , It makes the whole system difficult to realize and control （ If you are familiar with rocksdb Words , You'll feel it ）.
Design a system's underlying storage engine , It's easier than mastering a complex project , When relevant problems arise, it is easier to locate and solve , That's why cockroach Using self-developed Pebble Storage engines replace rocksdb, and LotusDB It is a storage engine that can be easily learned and mastered , Because it's simple 、 Intuitive and efficient .
LotusDB The overall architecture is as follows ：
LotusDB It's still there LSM Tree The process of writing in , Because this can ensure the persistence of write data and write throughput to the greatest extent , So I maintained on disk WAL journal , The newly written data is appended to WAL Ensure that the data is not lost , Then write to memory .
Multiple jump table structures are maintained in memory , The latest jump table is called active memtable, One memtable When it's full , Will turn into immutable memtable, It's immutable memtable, It cannot receive new writes , And wait for the background thread flush To disk .
Flush When , The data index information will be stored in B+ In the tree , and value Will be stored separately in Value Log in ,value log The structure of is similar to WAL, The data is written by log append , It's just value log There will be a threshold , When it's full, it opens a new value log, therefore value log There are many .
It should be noted that ,B+ Tree Try to store on new media , For example, solid state drives , Because I mentioned before B+ The tree is written randomly , If you use a traditional mechanical hard disk , Write performance is limited , Write amplification is serious ,Flush It may be a bottleneck .
This is it. LotusDB The overall realization of , In this realization , Let's take a look at the basic data reading and writing process .
Write a key/value： As I said before , and LSM The model is completely consistent , First the key/value Encapsulate it into a log and append it to WAL in , And then k/v Write to memory active memtable.
according to key Read a value： First in memory active memtable and immutable memtable Find... In turn , If you find a direct return . Otherwise, explain value May be on disk , From B+ Tree acquisition key Index information of , The index information is a binary <fid, offset>, identification value be located value log Which specific file in the , And the location in the file , Then directly according to this index information to value log Get in file value that will do .
Finally, let's summarize LotusDB Advantages of Architecture , A brief summary is as follows ：
1、 Write data flow and traditional LSM The model is completely consistent , The order is guaranteed IO High throughput , And data persistence
2、 Read performance compared to native LSM The model is more stable , Read amplification is reduced , Because the introduction of B+ Trees , Thanks to the B+ Tree stable read performance , The overall reading efficiency will be more controllable
3、 Completely removed LSM Tree Multilevel in the model SSTable, period SSTable The maintenance of the , And use the existing B+ Tree implementation （BoltDB）, Greatly reduce the complexity of the system
4、Compaction Reduce the loss of storage media ,LotusDB There are only value log There is Compaction; Native LSM Not only SSTable need Compaction, And if kv If separated ,value log It also needs to be Compaction
5、 The reading and writing process is simple and intuitive , No, bloom filter、block cache etc.
LotusDB Github Address ：https://github.com/flower-corp/lotusdb