Kafka says “exactly-once.” Your vendor promises “exactly-once.” The conference talk slides say “exactly-once.”

Your consumer just processed the same message twice. Customer got two emails. Order shipped twice.

The Three Guarantees#

At-most-once: Fire and forget. Message might get lost.

At-least-once: Retry until acknowledged. Might deliver duplicates.

Exactly-once: Delivered exactly one time. The holy grail.
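The difference between the first two is really just where the retry lives. Here's a toy sketch in plain Java — no Kafka involved; the lossy channel is simulated, and every name is made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Toy model of delivery semantics over a lossy channel.
class DeliverySemantics {
    // At-most-once: send once, never retry. A dropped send is a lost message.
    static List<String> atMostOnce(List<String> messages, Predicate<String> dropped) {
        List<String> delivered = new ArrayList<>();
        for (String m : messages) {
            if (!dropped.test(m)) delivered.add(m); // no ack, no retry
        }
        return delivered;
    }

    // At-least-once: retry until acknowledged. If the *ack* is lost, the
    // sender retries a message the receiver already processed -> duplicate.
    // (Toy assumption: the ack is lost at most once per message.)
    static List<String> atLeastOnce(List<String> messages, Predicate<String> ackLost) {
        List<String> delivered = new ArrayList<>();
        for (String m : messages) {
            delivered.add(m);          // first attempt: receiver processes it
            if (ackLost.test(m)) {
                delivered.add(m);      // ack lost -> sender retries -> duplicate
            }
        }
        return delivered;
    }
}
```

Nothing about at-least-once is broken when you see a duplicate; that's the contract working as designed.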

Here’s the thing: exactly-once works within a system. It’s impossible across systems.

The Boundary Problem#

When Kafka says “exactly-once,” it means within Kafka. Producer sends, Kafka deduplicates, consumer reads. Inside Kafka’s boundary, the message exists once.

But your consumer does things with messages:

```java
@KafkaListener(topics = "orders")
public void processOrder(OrderEvent event) {
    orderService.create(event.getOrder());    // Write to DB
    emailService.sendConfirmation(event);     // Send email
}
```

Kafka can’t guarantee your database write happened once. Or that the email was sent once.

```mermaid
%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#000000','primaryTextColor':'#00ff00','primaryBorderColor':'#00ff00','lineColor':'#00ff00','secondaryColor':'#000000','tertiaryColor':'#000000','noteBkgColor':'#000000','noteBorderColor':'#00ff00','noteTextColor':'#00ff00'}}}%%
sequenceDiagram
    autonumber
    participant K as Kafka
    participant C as Consumer
    participant DB as Database
    K->>C: Message (order-123)
    C->>DB: Insert order
    DB-->>C: Success
    C--xK: Commit offset (crash!)
    Note over K: Offset not committed
    K->>C: Redeliver (order-123)
    C->>DB: Insert order (duplicate!)
```

Consumer saved the order, then crashed before committing the offset. Kafka redelivers. Consumer processes again. Duplicate order.

Kafka kept its promise. Exactly-once within Kafka. But your system? Processed twice.

This is the boundary problem. Exactly-once requires atomicity. Either everything happens, or nothing does. Within a single database, transactions give you that. Within Kafka, its internal protocol gives you that. But there’s no distributed transaction spanning Kafka, your MySQL database, and your email provider. The Two Generals Problem tells us why: you can’t have perfect coordination across unreliable networks. Every system boundary is a failure point.
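Since a single database *can* give you atomicity, one common fix is to pull the “commit” inside that boundary: store the consumer’s own offset in the same database as the orders, so the write and the offset advance succeed or fail together. A minimal sketch, with an in-memory map standing in for the database transaction (the class and field names here are hypothetical, not a real Kafka API):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: "save the order" and "remember how far we've read" live in the
// same store, applied together. In production this would be one SQL
// transaction; here an all-or-nothing in-memory apply stands in for it.
class ConsumerState {
    final Map<String, String> orders = new HashMap<>(); // order-id -> payload
    long committedOffset = -1;

    void applyTransactionally(long offset, String orderId, String payload) {
        if (offset <= committedOffset) {
            return; // redelivery of an already-committed message: no-op
        }
        orders.put(orderId, payload);
        committedOffset = offset;   // advances in the same "transaction"
    }
}
```

If the process crashes after the transaction commits, the redelivered message carries an offset we already recorded, so the second apply does nothing. (Partition rebalancing and multi-partition offsets are deliberately omitted here.)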

What I’m Learning#

I used to think exactly-once was the goal. Then I debugged a production issue at Oracle where orders duplicated despite “exactly-once” being configured. The problem wasn’t Kafka. It was the database boundary.

The shift: stop preventing duplicates. Start handling them.
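Handling duplicates usually means an idempotent consumer: give every event a stable ID, record it alongside the business write, and treat an already-seen ID as a no-op. A toy sketch, with a `HashSet` standing in for a unique-keyed table written in the same transaction as the order (`OrderStore` and its methods are made up for illustration):

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Idempotent consumer sketch: dedupe on a stable event ID.
class OrderStore {
    private final Set<String> processedIds = new HashSet<>();
    private final List<String> orders = new ArrayList<>();

    // Returns true if the event was applied, false if it was a duplicate.
    boolean processOrder(String eventId, String order) {
        if (!processedIds.add(eventId)) {
            return false;          // already seen: redelivery is a no-op
        }
        orders.add(order);         // the real side effect happens once
        return true;
    }

    List<String> orders() { return orders; }
}
```

With this in place, at-least-once delivery plus an idempotent handler gives you effectively-once *processing* — which is what “exactly-once” claims quietly mean anyway. (Email sends are harder: there the dedupe key has to travel to the provider, e.g. as an idempotency key, if it supports one.)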

Have you ever had duplicate processing bite you in production?