Redis locks expire after a TTL. If the holding process crashes, everyone else waits out the remaining TTL — up to 30 seconds — before the lock frees up. ZooKeeper takes a different approach: tie the lock to the client's session, not to a timer.

## Ephemeral Nodes

ZooKeeper has two kinds of nodes: persistent (survive until explicitly deleted) and ephemeral (automatically deleted when the client session expires). A session is kept alive by a heartbeat. If the client crashes, heartbeats stop, the session expires after a configurable timeout, and the ephemeral node vanishes. No TTL math needed.
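The session-expiry mechanic can be sketched as a toy check (the helper and the timeout value are illustrative, not the ZooKeeper API — the real timeout is negotiated per session):

```python
SESSION_TIMEOUT = 10  # seconds; configurable per session in real ZooKeeper

def session_expired(last_heartbeat, now, timeout=SESSION_TIMEOUT):
    """A session is alive as long as a heartbeat arrived within the timeout."""
    return now - last_heartbeat > timeout

assert not session_expired(last_heartbeat=100, now=105)  # heartbeats flowing
assert session_expired(last_heartbeat=100, now=120)      # client died around t=100
```

The point is that liveness is observed, not scheduled: nothing has to guess how long the work will take.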

For distributed locking: create an ephemeral node at /locks/job-name. If creation succeeds, you hold the lock. If it fails because the node already exists, someone else holds it. When the lock holder finishes, it deletes the node explicitly; if it crashes instead, ZooKeeper cleans the node up automatically.
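As an in-memory sketch of those semantics (a toy model, not real ZooKeeper calls — `FakeZooKeeper` and its methods are invented for illustration):

```python
class FakeZooKeeper:
    """Toy model of ephemeral-node locking; not the real ZooKeeper API."""

    def __init__(self):
        self.nodes = {}  # path -> owning session id

    def create_ephemeral(self, session, path):
        """Return True if the lock was acquired, False if the node exists."""
        if path in self.nodes:
            return False
        self.nodes[path] = session
        return True

    def delete(self, path):
        """Explicit release by the lock holder."""
        self.nodes.pop(path, None)

    def expire_session(self, session):
        """Session died (missed heartbeats): reap all its ephemeral nodes."""
        self.nodes = {p: s for p, s in self.nodes.items() if s != session}


zk = FakeZooKeeper()
assert zk.create_ephemeral("client-a", "/locks/job-name")      # A holds the lock
assert not zk.create_ephemeral("client-b", "/locks/job-name")  # B is blocked
zk.expire_session("client-a")                                  # A crashes
assert zk.create_ephemeral("client-b", "/locks/job-name")      # B acquires it
```

The crash path and the clean-release path converge on the same outcome: the node is gone, so the next acquirer succeeds.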

## Watches and the Herd Problem

ZooKeeper lets clients subscribe to node changes via watches. When the lock holder releases, watchers get notified. Problem: if 50 clients are all watching the same node, all 50 wake up and rush to create it. Only one wins; the other 49 wasted a round trip. This is the herd effect (the thundering herd).

The fix is sequential ephemeral nodes. Each client creates /locks/job-name- with the SEQUENTIAL flag (combined with EPHEMERAL), which appends a monotonically increasing number to the name. The client with the lowest number holds the lock; every other client watches only the node immediately before it in the sequence.

```mermaid
graph TD
    A[Client 1 creates /locks/job-0001] --> B[Client 2 creates /locks/job-0002]
    B --> C[Client 3 creates /locks/job-0003]
    A --> D[Client 1 holds lock, watches nothing]
    B --> E[Client 2 watches job-0001]
    C --> F[Client 3 watches job-0002]
    D --> G[Lock released or crash: job-0001 deleted]
    G --> H[Client 2 notified, acquires lock]
    style A fill:#000000,stroke:#00ff00,stroke-width:2px,color:#fff
    style B fill:#000000,stroke:#00ff00,stroke-width:2px,color:#fff
    style C fill:#000000,stroke:#00ff00,stroke-width:2px,color:#fff
    style D fill:#000000,stroke:#00ff00,stroke-width:2px,color:#fff
    style E fill:#000000,stroke:#00ff00,stroke-width:2px,color:#fff
    style F fill:#000000,stroke:#00ff00,stroke-width:2px,color:#fff
    style G fill:#000000,stroke:#00ff00,stroke-width:2px,color:#fff
    style H fill:#000000,stroke:#00ff00,stroke-width:2px,color:#fff
```

Only one client wakes per release. Strict FIFO ordering. No retry-with-jitter needed.
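The predecessor selection fits in a few lines. A sketch (`watch_target` is a hypothetical helper, assuming node names like job-0001 as in the diagram):

```python
def watch_target(my_node, children):
    """Given this client's sequential node and all children of /locks,
    return the node to watch: the one immediately before it in sequence.
    The lowest node holds the lock and watches nothing (returns None)."""
    ordered = sorted(children, key=lambda n: int(n.rsplit("-", 1)[1]))
    i = ordered.index(my_node)
    return None if i == 0 else ordered[i - 1]


children = ["job-0002", "job-0001", "job-0003"]
assert watch_target("job-0001", children) is None        # lock holder
assert watch_target("job-0002", children) == "job-0001"
assert watch_target("job-0003", children) == "job-0002"
```

Note the sort by parsed sequence number rather than by raw string: it keeps the ordering correct even if the counter ever rolls past the zero-padding width.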

## ZooKeeper vs Redis

ZooKeeper is CP: it prioritizes consistency over availability. Redis is AP: faster, but weaker guarantees. For distributed locks where correctness matters more than speed, ZooKeeper wins. For most job-coordination scenarios where you mainly want to avoid duplicate work, Redis (or Redlock) is simpler to operate.

## At Oracle

We briefly evaluated ZooKeeper for cluster coordination before choosing Redis for lower operational overhead. The sequential node pattern for fair locking was genuinely elegant compared to what we built ourselves. Redis required custom retry-with-jitter logic and careful TTL tuning. ZooKeeper’s model just handles it. The tradeoff was running and operating a ZooKeeper ensemble versus a Redis cluster we already had. See also Leader Election for how ZooKeeper-style session-based approaches handle split-brain.

## What I’m Learning

Ephemeral nodes are a clean primitive: lifecycle tied to the client, not a wall clock. The sequential node pattern for herd prevention took me a while to appreciate fully.

Do you use ZooKeeper in production, or has the operational complexity pushed you toward Redis-based alternatives?