Distributed key value store

Publisher: IEEE

Authors: K Agilan, StudentHerbert Rithin, SRM Institute of Science and Technology *

Abstract:

Distributed Key-Value Stores: Hashing, Sharding, Replication and Fault Tolerance

Jai Ganesh, Agilan, Rithin Herbert
Department of Computer Science and Engineering
SRM Institute of Science and Technology
Tiruchirappalli, India

Dr S P Ramya
School of Computing
SRM Institute of Science and Technology
Tiruchirappalli, India


Abstract—This paper provides an in-depth review of distributed key-value stores, which have become building blocks of modern cloud platforms, large-scale web services, and data-intensive applications. These systems are designed to store very large and continuously growing data sets while providing low-latency access, horizontal scalability, and high availability. The review focuses on the four main architectural components that allow a distributed key-value store to operate correctly across many networked nodes. First, we describe hashing mechanisms, and consistent hashing in particular, which balances data distribution while minimizing the amount of data that must be moved as nodes join and leave the system. Second, we cover sharding techniques, which partition the data across nodes to improve performance and reduce storage contention. Next, we discuss data replication mechanisms, which provide reliability and prevent data loss in case of failure. Finally, we address fault tolerance mechanisms, which keep the system functioning even when nodes or network segments fail. We describe the major theoretical aspects along with the trade-offs involved in balancing consistency, availability, and system performance. Open issues and directions for future research are noted, especially with respect to dynamic scaling, handling skewed workloads, and providing stronger global consistency guarantees.


Index Terms—distributed key-value store, consistent hashing, sharding, replication, fault tolerance, distributed systems, NoSQL


I. INTRODUCTION


The rapid growth of large-scale data systems deployed in cloud platforms, social networks, e-commerce applications, and real-time analytics services has greatly increased the dependence on distributed key-value stores (KVS). Distributed key-value stores differ from relational databases in many ways; their key differentiating features are horizontal scalability, low-latency data access, and flexible (schema-free) data organization. They are designed to manage massive and rapidly evolving datasets that cannot be accommodated by single-node architectures. The purpose of this review is to examine the core architectural and algorithmic primitives that allow distributed key-value stores to be used in demanding environments. These primitives consist of key distribution mechanisms based on hashing, dataset partitioning techniques such as sharding, replication design methodologies for fault tolerance and durability, backing stores based on filesystems, and the networking considerations required to maintain performance across geographically distributed clusters. The discussion draws upon both foundational research concepts and practical implementations used in widely adopted systems such as Amazon Dynamo, Apache Cassandra, and Redis Cluster, thereby providing a balanced perspective that links theoretical design principles with real-world system behavior.
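
As a concrete illustration of the hashing-based key distribution and replication primitives named above, the following is a minimal Python sketch of a consistent-hash ring with virtual nodes. It is not taken from any of the systems mentioned; the class name HashRing, the vnodes parameter, the choice of MD5 as the ring hash, and the node and key names are all assumptions made for illustration only.

```python
import bisect
import hashlib


def ring_hash(value: str) -> int:
    """Map a string to a ring position (first 64 bits of its MD5 digest)."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Illustrative consistent-hash ring; each node owns several virtual points."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Place `vnodes` points for this node on the ring.
        for i in range(self.vnodes):
            point = ring_hash(f"{node}#{i}")
            if point in self._owner:  # skip the (very unlikely) collision
                continue
            bisect.insort(self._points, point)
            self._owner[point] = node

    def remove_node(self, node):
        # Drop every point owned by this node; other points are untouched.
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def replicas(self, key, n=1):
        """Return up to n distinct nodes, walking clockwise from the key."""
        if not self._points:
            return []
        start = bisect.bisect_right(self._points, ring_hash(key))
        found = []
        for step in range(len(self._points)):
            point = self._points[(start + step) % len(self._points)]
            node = self._owner[point]
            if node not in found:
                found.append(node)
                if len(found) == n:
                    break
        return found

    def lookup(self, key):
        """Return the single node responsible for the key."""
        return self.replicas(key, 1)[0]


# Usage: measure how many keys change owner when a fourth node joins.
ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"user:{i}" for i in range(10_000)]
before = {k: ring.lookup(k) for k in keys}
ring.add_node("node-d")
after = {k: ring.lookup(k) for k in keys}
moved = sum(before[k] != after[k] for k in keys)
print(f"{moved / len(keys):.1%} of keys changed owner")
```

In this sketch, every key that changes owner moves to the newly added node, and the remaining keys stay where they were, which is the property that keeps redistribution effort low as nodes join and leave. The replicas method returns the first n distinct nodes clockwise from the key, one simple way to pick replica holders on such a ring.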


The remainder of this paper is structured as follows. Section II introduces the key-value data model and the primary requirements that motivate distributed deployment. Section III explains consistent hashing and its role in scalable and stable key distribution. Section IV describes sharding methods and strategies for dynamic node assignment. Section V examines the replication policies, consistency guarantees, and fault tolerance techniques that maintain reliability. Section VI explores networking challenges, system bottlenecks, and performance trade-offs. Section VII discusses current research challenges and emerging future directions. Finally, Section VIII presents the conclusion.

Keywords: Stimulation Paradigm

Published in: 2024 Asian Conference on Communication and Networks (ASIANComNet)