Hinted Handoff

Node 3 is down. A write comes in that belongs there. You could reject it. Or you could accept it, hold it somewhere safe, and deliver it when Node 3 comes back.

What Hinted Handoff Does

In a distributed database with replication, each write goes to a coordinator node, which forwards it to the nodes that own the data. If an owner is unreachable, the coordinator stores the write temporarily with a hint: “this write is intended for Node 3.”

When Node 3 recovers, the coordinator delivers the queued hints. The node replays them and catches up on what it missed. From the client’s perspective, the write succeeded. From Node 3’s perspective, it received the data eventually.

This is how Cassandra maintains write availability even when a replica is down. You don’t need all replicas up to accept a write, only enough live replicas to satisfy the write’s consistency level.

graph TD
    A[Client write: user 42, event X] --> B[Coordinator Node]
    B --> C[Node 1 - write succeeds]
    B --> D[Node 2 - write succeeds]
    B --> E{Node 3 reachable?}
    E -->|Down| F[Store hint locally: intended for Node 3]
    E -->|Recovered| G[Deliver stored hints to Node 3]
    G --> H[Node 3 replays and catches up]
    style A fill:#000000,stroke:#00ff00,stroke-width:2px,color:#fff
    style B fill:#000000,stroke:#00ff00,stroke-width:2px,color:#fff
    style C fill:#000000,stroke:#00ff00,stroke-width:2px,color:#fff
    style D fill:#000000,stroke:#00ff00,stroke-width:2px,color:#fff
    style E fill:#000000,stroke:#00ff00,stroke-width:2px,color:#fff
    style F fill:#000000,stroke:#ff0000,stroke-width:2px,color:#fff
    style G fill:#000000,stroke:#00ff00,stroke-width:2px,color:#fff
    style H fill:#000000,stroke:#00ff00,stroke-width:2px,color:#fff
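The flow above can be sketched in a few lines of Python. This is a toy model, not how Cassandra implements it: the `Node` and `Coordinator` classes, their method names, and the in-memory hint list are all assumptions made for illustration.

```python
from collections import defaultdict


class Node:
    """A replica that stores key/value pairs and can be marked down."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}

    def apply(self, key, value):
        if not self.up:
            raise ConnectionError(f"{self.name} is unreachable")
        self.data[key] = value


class Coordinator:
    """Forwards writes to replicas and keeps hints for unreachable ones."""

    def __init__(self, replicas):
        self.replicas = replicas
        self.hints = defaultdict(list)  # target node name -> [(key, value)]

    def write(self, key, value):
        acks = 0
        for node in self.replicas:
            try:
                node.apply(key, value)
                acks += 1
            except ConnectionError:
                # Keep the write locally, tagged with its intended owner.
                self.hints[node.name].append((key, value))
        return acks

    def replay_hints(self, node):
        """Deliver queued hints once the target is reachable again."""
        pending = self.hints.pop(node.name, [])
        for key, value in pending:
            node.apply(key, value)
        return len(pending)


nodes = [Node("node1"), Node("node2"), Node("node3")]
coordinator = Coordinator(nodes)

nodes[2].up = False
acks = coordinator.write("user:42", "event X")  # only the live replicas acknowledge

nodes[2].up = True
replayed = coordinator.replay_hints(nodes[2])   # node3 catches up
```

A real coordinator would persist hints to disk and handle a target that fails again mid-replay; the sketch skips both to keep the core idea visible.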

The Time Limit

Hinted handoff has a window. Cassandra defaults to 3 hours. If Node 3 is down longer than that, coordinators stop storing new hints for it, so later writes are not covered by handoff. At that point you need anti-entropy repair: comparing data between nodes using Merkle trees to find and fill the gaps. Hinted handoff handles short outages. Repair handles longer ones.
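To make the repair side concrete, here is a minimal sketch of a Merkle tree comparison. It assumes each replica's data is already split into the same power-of-two number of key-range buckets; the function names and the bucket layout are illustrative, not Cassandra's actual repair code.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(buckets):
    """Return the tree as a list of levels: leaves first, root last.

    `buckets` is a list of dicts, one per key range. Its length is
    assumed to be a power of two.
    """
    level = [_h(repr(sorted(b.items())).encode()) for b in buckets]
    levels = [level]
    while len(level) > 1:
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def differing_buckets(tree_a, tree_b):
    """Walk down from the root, descending only into subtrees whose hashes differ."""
    top = len(tree_a) - 1
    suspects = [0]  # node indices at the current level, starting at the root
    for depth in range(top, 0, -1):
        suspects = [
            child
            for idx in suspects
            if tree_a[depth][idx] != tree_b[depth][idx]
            for child in (2 * idx, 2 * idx + 1)
        ]
    # Suspects are now leaf indices; keep those whose hashes really differ.
    return [i for i in suspects if tree_a[0][i] != tree_b[0][i]]


healthy = [{"a": 1}, {"b": 2}, {"c": 3}, {"d": 4}]
stale = [{"a": 1}, {"b": 2}, {}, {"d": 4}]  # missed the write to "c"
out_of_sync = differing_buckets(build_tree(healthy), build_tree(stale))
```

Only the buckets listed in `out_of_sync` need to be streamed between replicas, which is the point of comparing hashes instead of rows.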

Hint storage itself can become a problem. If a node is down for hours under heavy write load, the coordinator can accumulate a large backlog of hints. This consumes local disk. The hint window bounds this growth, and Cassandra can additionally cap hint storage per target node to keep the coordinator from running out of space.
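One way to picture a per-target cap is a queue with a byte budget for each down node. The sketch below is an assumption-laden toy: `CappedHintStore`, `max_bytes_per_target`, and the policy of rejecting new hints once the budget is full are all hypothetical choices, not Cassandra's behavior or settings.

```python
from collections import defaultdict, deque


class CappedHintStore:
    """Holds hints per target node, refusing new ones past a byte budget."""

    def __init__(self, max_bytes_per_target):
        self.max_bytes_per_target = max_bytes_per_target
        self.queues = defaultdict(deque)   # target -> queued payloads
        self.used = defaultdict(int)       # target -> bytes currently held
        self.dropped = defaultdict(int)    # target -> hints refused

    def add(self, target, payload: bytes) -> bool:
        if self.used[target] + len(payload) > self.max_bytes_per_target:
            # Budget exhausted: count the drop so monitoring can see it.
            self.dropped[target] += 1
            return False
        self.queues[target].append(payload)
        self.used[target] += len(payload)
        return True

    def drain(self, target):
        """Yield hints oldest-first, releasing their space as they go."""
        queue = self.queues[target]
        while queue:
            payload = queue.popleft()
            self.used[target] -= len(payload)
            yield payload


store = CappedHintStore(max_bytes_per_target=10)
results = [store.add("node3", b"12345") for _ in range(3)]
delivered = list(store.drain("node3"))
```

Every refused hint is a write the target will only recover through repair, so the `dropped` counter is worth exporting.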

Sloppy Quorums

In Dynamo-style systems, hinted handoff pairs with sloppy quorums. During a partition, writes are accepted by nodes that don’t normally own the data, with hints attached. The quorum count is satisfied, the write returns success, and the hint ensures the data reaches the right owner when the partition heals.
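A rough sketch of how a sloppy quorum might pick its targets, assuming a preference list in ring order where the first `n` nodes are the natural owners. The function names and the `(node, hinted_for)` tuple shape are made up for this example.

```python
def sloppy_write_targets(preference_list, is_up, n):
    """Pick up to n reachable nodes for a write.

    Returns a list of (node, hinted_for) pairs. `hinted_for` is None when
    the node is a natural owner, or the name of the down owner it stands
    in for.
    """
    owners = preference_list[:n]
    fallbacks = [node for node in preference_list[n:] if is_up(node)]
    targets = []
    for owner in owners:
        if is_up(owner):
            targets.append((owner, None))
        elif fallbacks:
            # The next healthy node on the ring takes the write, with a hint.
            targets.append((fallbacks.pop(0), owner))
    return targets


def write_succeeds(targets, w):
    """The write is acknowledged once at least w nodes have accepted it."""
    return len(targets) >= w


ring = ["A", "B", "C", "D", "E"]
down = {"C"}
targets = sloppy_write_targets(ring, lambda node: node not in down, n=3)
ok = write_succeeds(targets, w=3)
```

With a strict quorum and `w=3`, this write would fail because only two owners are up. The sloppy version succeeds because D accepts the write on C's behalf and holds the hint.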

At Oracle

We had a MySQL primary with two replicas. When a replica went down for maintenance, all writes during that period went only to the remaining replica and primary. The recovered replica caught up through binlog replication, which worked fine but was slow for large write volumes. The first long maintenance window (about 4 hours) left the replica significantly behind. Understanding hinted handoff made me see that replication lag during outages is a fundamental problem that wide-column stores address differently.

What I’m Learning

Hinted handoff is optimistic: it assumes the target node will come back within the hint window. For nodes that stay down longer, it fails silently: the replica returns missing writes, and nothing reports the gap until a repair or an inconsistent read exposes it. Monitoring hint replay rates and hint expiry rates tells you whether your cluster is healthy or quietly falling behind.
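As a sketch of what that monitoring could compute, the snippet below turns two samples of cumulative hint counters into rates. The `HintSample` fields are hypothetical names, and how you actually scrape these counters depends on your database and metrics stack.

```python
from dataclasses import dataclass


@dataclass
class HintSample:
    timestamp: float     # seconds since some fixed epoch
    hints_created: int   # cumulative counters (hypothetical names)
    hints_replayed: int
    hints_expired: int


def hint_health(prev: HintSample, curr: HintSample) -> dict:
    """Derive per-second rates and two warning flags from two samples."""
    elapsed = curr.timestamp - prev.timestamp
    if elapsed <= 0:
        raise ValueError("samples must be in chronological order")
    created = (curr.hints_created - prev.hints_created) / elapsed
    replayed = (curr.hints_replayed - prev.hints_replayed) / elapsed
    expired = (curr.hints_expired - prev.hints_expired) / elapsed
    return {
        "created_per_s": created,
        "replayed_per_s": replayed,
        "expired_per_s": expired,
        # More hints arriving than leaving: the backlog is building up.
        "backlog_growing": created > replayed + expired,
        # Any expiry means some writes will only be recovered by repair.
        "losing_hints": expired > 0,
    }


prev = HintSample(timestamp=0.0, hints_created=1000, hints_replayed=900, hints_expired=0)
curr = HintSample(timestamp=60.0, hints_created=1600, hints_replayed=1200, hints_expired=60)
report = hint_health(prev, curr)
```

A growing backlog on its own may just mean a node is briefly down. A non-zero expiry rate is the stronger signal that repair is overdue.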

Have you monitored hint accumulation in your distributed database, or trusted it to just work?