Design and Implementation of a Scalable Distributed Database System with LSM-tree Storage and Consistent Hashing
Time: 01 Jan 1970, 08:00
Session: [S1] Day-1 (06/12/2025) » [S1-2] Technical Sessions 1
Type: Oral Presentation
Abstract:
This paper presents the design, implementation details, and evaluation of a novel distributed database system that combines Log-Structured Merge-tree (LSM-tree) storage with consistent hashing, so that the two complement each other in data distribution and retrieval. The system targets the main challenges of data-intensive applications: it sustains high write throughput while preserving strong read performance and fault tolerance. The architecture uses a new ring-based topology with automatic data replication, efficient secondary indexing, and memory management based on intelligent memtable flushing. Experiments show a write throughput of 45,231 operations/second, a 17.6% improvement over Apache Cassandra and a 266% improvement over MongoDB. In the test environment, the system scales linearly up to 12 nodes and consistently achieves sub-millisecond latency for 95% of read operations.
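As an illustration of the ring-based placement with automatic replication described in the abstract, the following is a minimal sketch of a consistent-hash ring in Python. It is not the authors' implementation: the class name `HashRing`, the use of MD5 as the ring hash, the virtual-node count `vnodes`, and the replication factor `replicas` are all assumptions made for the example, since the abstract does not specify them.

```python
import bisect
import hashlib


def ring_hash(key: str) -> int:
    """Map a string to a ring position (first 8 bytes of MD5; an assumed choice)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes and replica selection."""

    def __init__(self, nodes=(), vnodes=64, replicas=3):
        self.vnodes = vnodes      # virtual points per physical node (assumed value)
        self.replicas = replicas  # replication factor (assumed value)
        self._points = []         # sorted ring positions
        self._owner = {}          # ring position -> physical node
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        """Place `vnodes` points for the node on the ring."""
        for i in range(self.vnodes):
            point = ring_hash(f"{node}#{i}")
            if point in self._owner:  # skip the (very unlikely) position collision
                continue
            bisect.insort(self._points, point)
            self._owner[point] = node

    def remove_node(self, node: str) -> None:
        """Drop every point owned by the node; other nodes' points are untouched."""
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def nodes_for(self, key: str) -> list:
        """Walk clockwise from the key's position and collect distinct nodes."""
        if not self._points:
            return []
        want = min(self.replicas, len(set(self._owner.values())))
        start = bisect.bisect_right(self._points, ring_hash(key))
        chosen = []
        for i in range(len(self._points)):
            point = self._points[(start + i) % len(self._points)]
            node = self._owner[point]
            if node not in chosen:
                chosen.append(node)
                if len(chosen) == want:
                    break
        return chosen


ring = HashRing(["node-a", "node-b", "node-c", "node-d"])
placement = ring.nodes_for("user:42")  # primary first, then the replicas
```

The first entry of `placement` acts as the primary and the rest as replicas. Removing a node that does not hold a given key leaves that key's placement unchanged, which is the property that keeps data movement small when the cluster is resized.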
Keywords:
database, LSM-tree, secondary indexing, hashing, replication
Speaker: