LSM trees create SSTables quickly but need compaction. Learn the size-tiered and leveled compaction strategies and the tradeoff between write and read amplification.
Posts for: #System-Design
LSM Trees vs B-Trees: Write Fast or Read Fast
The write-fast or read-fast tradeoff: learn when to use B-trees (MySQL) vs LSM trees (Cassandra) based on your database workload.
Write-Ahead Logging: How Databases Survive Crashes
How do databases survive crashes and ensure durability? Learn how Write-Ahead Logging (WAL) uses sequential writes to guarantee data persistence without killing performance.
Read Repair and Anti-Entropy: Healing Stale Replicas
How do stale replicas catch up in distributed systems? Compare read repair and anti-entropy strategies with Merkle trees for healing data divergence.
Conflict Resolution: When Two Writes Win
Concurrent writes in distributed databases don’t merge automatically. Learn to detect conflicts with version vectors and resolve them without losing data.
Replication Lag: The Bug That Isn’t a Bug
Users see stale data after writes. It’s not a bug; it’s replication lag. Learn to handle read-after-write problems and causality violations in production.
Consistency Models: What Eventually Means
Eventual consistency doesn’t mean milliseconds. Understand linearizability, causal consistency, and quorum reads to pick the right consistency model.
Secondary Indexes in Distributed Databases
Querying partitioned databases by non-partition keys? Learn the tradeoffs between local and global secondary indexes in distributed systems.
The Hidden Cost of JOINs
Every JOIN multiplies query complexity. Learn the three JOIN strategies databases use and when denormalization beats JOIN performance by 30x.
Indexing Strategies That Actually Work
More indexes don’t mean faster queries. Learn when to add, remove, and optimize database indexes. Real examples of 7x performance gains through strategic indexing.
Virtual Nodes: The Three-Layer Pattern of Consistent Hashing
Understand virtual nodes in consistent hashing through a simple three-layer model that decouples data distribution from server topology in distributed systems.
The Query Optimization Framework
Stop guessing at performance problems. Learn the 5-step systematic framework for debugging slow queries that helped reduce query times from 2+ seconds to 30 ms.