How to partition a database across multiple servers. Hash-based vs range-based sharding, rebalancing strategies, and the complexity that comes with it.
Posts for: #System-Design
Rate Limiting: Token Bucket vs Leaky Bucket
Protecting services from overload with rate limiting. Token bucket and leaky bucket algorithms explained with Java implementations and real-world trade-offs.
Backpressure: When Consumers Can’t Keep Up
Handling slow consumers in distributed systems. Queue growth, memory exhaustion, and strategies for applying backpressure: rejection, rate limiting, and flow control.
Retry Strategies: Exponential Backoff and Jitter
How to retry failed requests without overwhelming servers. Exponential backoff, jitter, and when to give up. Java implementations and real-world patterns.
Idempotency: Why Retries Need It
How to make operations safe to retry. Idempotency keys, database patterns, and why retrying non-idempotent operations causes data corruption.
Session Guarantees: The Promises Your Database Makes to You
Read-your-writes and monotonic reads aren’t just buzzwords. They’re the difference between a database that feels broken and one that makes sense to users.
Horizontal vs Vertical Scaling: Bigger Machine or More Machines
Comparing vertical scaling (scale up) and horizontal scaling (scale out). When to use each, trade-offs, and the complexity that comes with horizontal scaling.
Load Balancing Strategies: Picking the Right Server
Comparing load balancing algorithms: Round Robin, Least Connections, Weighted Round Robin, and IP Hash. Java implementations and real-world trade-offs.
Bloom Filters: Definitely Not Here
Bloom filters let LSM trees skip unnecessary disk reads by saying ‘definitely not here’ with zero false negatives. Learn how Cassandra and RocksDB use them.
Compaction Strategies: Cleaning Up After LSM Trees
LSM trees create SSTables fast but need compaction. Learn size-tiered vs leveled compaction strategies and the write vs read amplification tradeoff.
LSM Trees vs B-Trees: Write Fast or Read Fast
LSM Trees vs B-Trees: the write-fast or read-fast tradeoff. Learn when to use B-trees (MySQL) vs LSM trees (Cassandra) based on your database workload.
Write-Ahead Logging: How Databases Survive Crashes
How do databases survive crashes and ensure durability? Learn how Write-Ahead Logging (WAL) uses sequential writes to guarantee data persistence without killing performance.
Read Repair and Anti-Entropy: Healing Stale Replicas
How do stale replicas catch up in distributed systems? Compare read repair and anti-entropy strategies with Merkle trees for healing data divergence.
Conflict Resolution: When Two Writes Win
Concurrent writes in distributed databases don’t merge automatically. Learn to detect conflicts with version vectors and resolve them without losing data.
Replication Lag: The Bug That Isn’t a Bug
Users see stale data after writes. It’s not a bug, it’s replication lag. Learn to handle read-after-write problems and causality violations in production.
Consistency Models: What Eventually Means
Eventual consistency doesn’t mean milliseconds. Understand linearizability, causal consistency, and quorum reads to pick the right consistency model.
Secondary Indexes in Distributed Databases
Querying partitioned databases by non-partition keys? Learn the tradeoffs between local and global secondary indexes in distributed systems.
The Hidden Cost of JOINs
Every JOIN multiplies query complexity. Learn the three JOIN strategies databases use and when denormalization beats JOIN performance by 30x.
Indexing Strategies That Actually Work
More indexes don’t mean faster queries. Learn when to add, remove, and optimize database indexes. Real examples of 7x performance gains through strategic indexing.
Virtual Nodes: The Three-Layer Pattern of Consistent Hashing
Understand virtual nodes in consistent hashing through a simple three-layer model that decouples data distribution from server topology in distributed systems.
The Query Optimization Framework
Stop guessing at performance problems. Learn the 5-step systematic framework for debugging slow queries that helped reduce query times from 2+ seconds to 30 ms.