Posts for: #Database

Tenant-Aware Data Partitioning

2026-05-07 sohilladhani

#distributed-systems #system-design #architecture #database

You shard your database to scale. You pick a shard key. If you pick something unrelated to tenant, queries for one tenant’s data scatter across all shards. If you pick tenant ID, all of one tenant’s data lands on one shard, and a large tenant can overwhelm it.

Why Tenant ID Makes Sense as a Shard Key

Tenant isolation is the priority in a multi-tenant system. If all of Tenant A’s data is on Shard 2, a query for Tenant A’s records goes to Shard 2 only.
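A minimal sketch of tenant-keyed routing, to make the "one tenant, one shard" idea concrete. The shard count, the hash choice, and the function names `shard_for_tenant` and `shards_for_query` are assumptions for illustration, not taken from the post.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical shard count


def shard_for_tenant(tenant_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a tenant ID to a shard using a stable hash.

    Python's built-in hash() is salted per process, so a cryptographic
    digest is used here to get the same answer on every run.
    """
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def shards_for_query(tenant_id: str) -> list[int]:
    """With tenant ID as the shard key, a tenant query touches one shard."""
    return [shard_for_tenant(tenant_id)]
```

The same property that keeps the query on one shard is what lets a large tenant overwhelm it: every row for that tenant hashes to the same place.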

Multi-Tenancy Patterns

2026-05-05 sohilladhani

#distributed-systems #system-design #architecture #database

You’re building a SaaS product. Do you give each customer their own database? Put everyone in one? Somewhere in between? The answer affects cost, isolation, compliance, and how much operational pain you take on for the life of the product.

The Three Models

Shared database, shared schema: all tenants in the same tables, with a tenant_id column. One database to manage. Lowest cost. The risk: a bug that forgets the tenant_id filter leaks one customer’s data to another.

Hinted Handoff

2026-04-30 sohilladhani

#distributed-systems #system-design #database #consistency #architecture

Node 3 is down. A write comes in that belongs there. You could reject it. Or you could accept it, hold it somewhere safe, and deliver it when Node 3 comes back.

What Hinted Handoff Does

In a distributed database with replication, each write goes to a coordinator node, which forwards it to the nodes that own the data. If an owner is unreachable, the coordinator stores the write temporarily with a hint: “this write is intended for Node 3.
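A toy, in-memory sketch of the idea: the coordinator parks a write for an unreachable owner and replays it on recovery. The `Coordinator` class, its methods, and the dict-based "nodes" are hypothetical and exist only for illustration; a real database also deals with hint expiry, durability of the hints, and multiple replicas.

```python
from collections import defaultdict


class Coordinator:
    def __init__(self, nodes):
        self.nodes = nodes              # node name -> dict acting as that node's storage
        self.down = set()               # nodes currently considered unreachable
        self.hints = defaultdict(list)  # target node -> [(key, value), ...]

    def write(self, owner, key, value):
        """Deliver the write, or store it as a hint if the owner is down."""
        if owner in self.down:
            self.hints[owner].append((key, value))
            return "hinted"
        self.nodes[owner][key] = value
        return "written"

    def node_recovered(self, owner):
        """Mark the node as up and replay its hints in arrival order."""
        self.down.discard(owner)
        for key, value in self.hints.pop(owner, []):
            self.nodes[owner][key] = value
```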

Column-Family Storage

2026-04-29 sohilladhani

#distributed-systems #system-design #architecture #database

Your query is always “give me all events for user X, sorted by time.” A row-oriented database gives you rows where you pay for every column you didn’t ask for. Wide-column stores flip the model: you design the schema around your query, not the other way around.

How It Works

In a wide-column store like Cassandra or HBase, the primary key has two parts: the partition key and the clustering key.
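A rough in-memory model of the two-part key, assuming the usual Cassandra-style roles: the partition key groups rows together, and the clustering key keeps them sorted inside that group. `WideColumnTable` and its methods are hypothetical names, and payloads are plain strings to keep the sketch simple.

```python
import bisect
from collections import defaultdict


class WideColumnTable:
    def __init__(self):
        # partition key -> list of (clustering_key, payload), kept sorted
        self._partitions = defaultdict(list)

    def insert(self, user_id, event_time, payload):
        """user_id plays the partition key, event_time the clustering key."""
        bisect.insort(self._partitions[user_id], (event_time, payload))

    def events_for(self, user_id):
        """'All events for user X, sorted by time' is one partition read."""
        return list(self._partitions.get(user_id, []))
```

Because rows are stored in clustering-key order at write time, the read needs no sort step.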

Inverted Indexes: How Search Actually Works

2026-03-05 sohilladhani

#data-structures #system-design #java #database #distributed-systems

A normal index maps documents to words. An inverted index maps words to documents. That reversal is why search is fast.
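A small sketch of that reversal, assuming whitespace tokenization and lowercase matching (real search engines do far more). The function names `build_inverted_index` and `search` are made up for this example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """docs: {doc_id: text}. Returns {word: set of doc_ids containing it}."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, *words):
    """Return the doc IDs that contain every query word."""
    postings = [index.get(w.lower(), set()) for w in words]
    return set.intersection(*postings) if postings else set()
```

A lookup is a dictionary access per word plus a set intersection, instead of a scan over every document.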

Optimistic vs Pessimistic Concurrency: Locks vs Versions

2026-02-27 sohilladhani

#distributed-systems #database #system-design #java #mysql

Two users update the same row. Pessimistic locking blocks one until the other finishes. Optimistic locking lets both try and fails the loser. Choosing wrong kills either throughput or correctness.
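A sketch of the optimistic side using a version column. It runs on SQLite through Python's `sqlite3` module rather than MySQL so it stays self-contained; the `accounts` table and the `update_balance` helper are hypothetical.

```python
import sqlite3


def update_balance(conn, account_id, new_balance, expected_version):
    """Apply the update only if nobody changed the row since we read it."""
    cur = conn.execute(
        "UPDATE accounts SET balance = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_balance, account_id, expected_version),
    )
    conn.commit()
    return cur.rowcount == 1  # zero rows touched means we lost the race


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER, version INTEGER)"
)
conn.execute("INSERT INTO accounts VALUES (1, 100, 0)")

# Both users read version 0, then both try to write.
first = update_balance(conn, 1, 150, expected_version=0)   # succeeds
second = update_balance(conn, 1, 120, expected_version=0)  # stale version, fails
```

The loser is expected to re-read the row and retry or surface a conflict. A pessimistic variant would instead lock the row up front (for example with `SELECT ... FOR UPDATE` in MySQL) and make the second user wait.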

Two-Phase Commit: The Original Distributed Transaction

2026-02-26 sohilladhani

#distributed-systems #system-design #architecture #java #database

Two-phase commit guarantees atomicity across multiple databases. It also blocks everything if the coordinator dies. Here’s why microservices moved on.

Distributed ID Generation: Snowflake and Friends

2026-02-21 sohilladhani

#distributed-systems #system-design #architecture #java #database

Auto-increment IDs break the moment you have more than one database. Snowflake IDs, UUIDs, and database sequences each solve this differently.
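A simplified Snowflake-style generator, assuming the commonly cited layout of a millisecond timestamp, 10 worker bits, and 12 sequence bits. The epoch value and the class name are assumptions, and the handling of a backwards clock is deliberately naive.

```python
import threading
import time

EPOCH_MS = 1_600_000_000_000  # hypothetical custom epoch, in milliseconds
WORKER_BITS = 10
SEQ_BITS = 12


class SnowflakeGenerator:
    def __init__(self, worker_id):
        if not 0 <= worker_id < (1 << WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self.last_ms = -1
        self.seq = 0
        self.lock = threading.Lock()

    def next_id(self):
        with self.lock:
            now = int(time.time() * 1000)
            if now < self.last_ms:
                now = self.last_ms  # clock went backwards: reuse last timestamp
            if now == self.last_ms:
                self.seq = (self.seq + 1) & ((1 << SEQ_BITS) - 1)
                if self.seq == 0:
                    # Sequence exhausted for this millisecond: wait for the next one.
                    while now <= self.last_ms:
                        now = int(time.time() * 1000)
            else:
                self.seq = 0
            self.last_ms = now
            return (
                ((now - EPOCH_MS) << (WORKER_BITS + SEQ_BITS))
                | (self.worker_id << SEQ_BITS)
                | self.seq
            )
```

Each worker generates IDs without coordinating with any other, and IDs from one worker sort by creation time.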

Social Graphs at Scale: Storing Relationships in MySQL

2026-02-19 sohilladhani

#mysql #database #system-design #architecture #performance

A follows table with two columns seems trivial. Until you need to query it from both directions, across shards, for millions of users.

Cursor-Based Pagination: Why Offset Breaks at Scale

2026-02-15 sohilladhani

#mysql #database #performance #system-design #java

OFFSET 50000 makes MySQL scan 50,000 rows just to skip them. Cursor pagination stays fast no matter how deep you go.
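A sketch of the cursor approach, where each page starts after the last ID already seen. It uses SQLite via `sqlite3` instead of MySQL so it runs anywhere; the `posts` table and `fetch_page` helper are hypothetical.

```python
import sqlite3


def fetch_page(conn, after_id=0, page_size=3):
    """Return one page of rows plus the cursor for the next page."""
    rows = conn.execute(
        "SELECT id, title FROM posts WHERE id > ? ORDER BY id LIMIT ?",
        (after_id, page_size),
    ).fetchall()
    next_cursor = rows[-1][0] if rows else None
    return rows, next_cursor


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?)", [(i, f"post {i}") for i in range(1, 8)]
)

page1, cursor = fetch_page(conn)                   # ids 1..3
page2, cursor2 = fetch_page(conn, after_id=cursor)  # ids 4..6
```

The `WHERE id > ?` predicate lets the database seek straight to the start of the page through the primary key index, so page 1,000 should cost about the same as page 1.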

Read Replicas: Hidden Consistency Traps

2026-02-12 sohilladhani

#mysql #database #replication #consistency #system-design

You added read replicas to scale reads. Now users update their profile and see the old version. Welcome to replica lag.

Database Migrations Without Downtime

2026-02-09 sohilladhani

#mysql #database #system-design #deployment #architecture

ALTER TABLE on a 2M-row table locks it for minutes. Your users see errors. Here’s how expand-contract and shadow writes let you migrate without downtime.

Connection Pooling: Why Opening Connections Is Expensive

2026-01-31 sohilladhani

#performance #database #connection-pooling #java #system-design

The hidden cost of database connections. How connection pools work, why they matter, and how to size them without guessing.

The CAP Theorem: The Cliché I Tried to Avoid

2026-01-21 sohilladhani

#distributed-systems #cap-theorem #database #system-design #architecture

Why the CAP Theorem is the most misunderstood rule in system design. Addressing the ‘Pick 2’ lie and how it sets the stage for consensus algorithms.

Materialized Views: The Read Optimization Pattern

2026-01-18 sohilladhani

#distributed-systems #database #performance #cqrs #system-design

Why standard views are just aliases and how materialized views act as an ‘in-database cache’ to solve the cross-shard query problem.

Change Data Capture: Streaming Database Changes

2026-01-14 sohilladhani

#database #cdc #streaming #event-driven #system-design

How to capture and stream database changes in real-time. CDC patterns, implementation approaches, and when to use it instead of application-level events.

Database Sharding: Splitting Data Across Machines

2026-01-12 sohilladhani

#distributed-systems #database #sharding #partitioning #system-design

How to partition a database across multiple servers. Hash-based vs range-based sharding, rebalancing strategies, and the complexity that comes with it.

Bloom Filters: Definitely Not Here

2026-01-03 sohilladhani

#database #bloom-filters #data-structures #system-design

Bloom filters skip unnecessary disk reads in LSM trees by saying ‘definitely not here’ with zero false negatives. Learn how Cassandra and RocksDB use them.
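A minimal Bloom filter sketch showing why "definitely not here" is safe to trust. The sizes, the SHA-256-based hashing, and the class and method names are assumptions for illustration; Cassandra and RocksDB use their own tuned implementations.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting the key with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        """False means definitely absent; True means 'maybe, go check disk'."""
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )
```

Bits are only ever set, never cleared, so a key that was added always reports True. That is where the zero-false-negatives guarantee comes from; false positives remain possible when other keys happen to set the same bits.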

Compaction Strategies: Cleaning Up After LSM Trees

2026-01-02 sohilladhani

#database #lsm-trees #compaction #system-design

LSM trees create SSTables fast but need compaction. Learn size-tiered vs leveled compaction strategies and the write vs read amplification tradeoff.

LSM Trees vs B-Trees: Write Fast or Read Fast

2026-01-01 sohilladhani

#database #data-structures #storage #system-design

LSM Trees vs B-Trees: the write-fast or read-fast tradeoff. Learn when to use B-trees (MySQL) vs LSM trees (Cassandra) based on your database workload.

Write-Ahead Logging: How Databases Survive Crashes

2025-12-31 sohilladhani

#database #durability #wal #system-design

How do databases survive crashes and ensure durability? Learn how Write-Ahead Logging (WAL) uses sequential writes to guarantee data persistence without killing performance.

Query Execution Plans: Reading EXPLAIN Like a Map

2025-12-26 sohilladhani

#database #mysql #performance #explain

Stop staring at EXPLAIN output confused. Learn to read MySQL execution plans like a map and find the root cause of slow queries in seconds, not hours.

Secondary Indexes in Distributed Databases

2025-12-25 sohilladhani

#distributed-systems #database #partitioning #system-design

Querying partitioned databases by non-partition keys? Learn the tradeoffs between local and global secondary indexes in distributed systems.

The Hidden Cost of JOINs

2025-12-24 sohilladhani

#database #performance #sql #system-design

Every JOIN multiplies query complexity. Learn the three JOIN strategies databases use and when denormalization beats JOIN performance by 30x.

Indexing Strategies That Actually Work

2025-12-23 sohilladhani

#database #indexing #performance #system-design

More indexes don’t mean faster queries. Learn when to add, remove, and optimize database indexes. Real examples of 7x performance gains through strategic indexing.

The Query Optimization Framework

2025-12-21 sohilladhani

#database #performance #optimization #system-design

Stop guessing at performance problems. Learn the 5-step systematic framework for debugging slow queries that helped reduce query times from 2+ seconds to 30ms.