Saga Pattern: Managing Distributed Transactions

Distributed transactions are a trap.

In a monolith backed by a single database, it’s easy. You put everything in one @Transactional method. If the payment fails, the order doesn’t get created. Atomicity is essentially free.

In microservices, you don’t have that luxury. Each service owns its own database, so you can’t start a transaction in the Order service and have it magically span the Payment and Inventory services.

This is where you need a Saga.

What is a Saga?

A Saga is just a sequence of local transactions. Each step commits to its own service’s database and triggers the next step. If a step fails, you have to run “compensating transactions” for the steps that already committed, usually in reverse order. Basically, an “Undo” button for everything that happened before.
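To make the “Undo” idea concrete, here is a minimal sketch of a saga runner. Everything in it (`SagaStep`, `run_saga`, the order flow, the in-memory `log`) is made up for illustration, and it assumes a failed step has already rolled back its own local transaction.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SagaStep:
    name: str
    action: Callable[[], None]        # the local transaction
    compensation: Callable[[], None]  # the "undo" for that transaction


def run_saga(steps: List[SagaStep]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: List[SagaStep] = []
    for step in steps:
        try:
            step.action()
        except Exception:
            # Assumption: the failed step rolled back its own local
            # transaction, so only the committed steps need undoing.
            for done in reversed(completed):
                done.compensation()
            return False
        completed.append(step)
    return True


# Hypothetical order flow; the log stands in for three separate databases.
log: List[str] = []


def reserve_inventory() -> None:
    raise RuntimeError("out of stock")


steps = [
    SagaStep("order", lambda: log.append("order created"),
             lambda: log.append("order cancelled")),
    SagaStep("payment", lambda: log.append("card charged"),
             lambda: log.append("card refunded")),
    SagaStep("inventory", reserve_inventory,
             lambda: log.append("reservation released")),
]

succeeded = run_saga(steps)
print(succeeded, log)
```

Because the inventory step fails, the runner refunds the card and then cancels the order, in that order. A real implementation would also need to persist progress so a crash mid-saga can be recovered, which this sketch ignores.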

There are two ways I’ve seen people do this:

1. Choreography (The “Event-Driven” Way)

No one is in charge. Services just listen for events and react.

%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#000000','primaryTextColor':'#00ff00','primaryBorderColor':'#00ff00','lineColor':'#00ff00','secondaryColor':'#000000','tertiaryColor':'#000000','noteBkgColor':'#000000','noteBorderColor':'#00ff00','noteTextColor':'#00ff00'}}}%%
sequenceDiagram
    autonumber
    participant OS as Order Service
    participant PS as Payment Service
    participant IS as Inventory Service
    OS->>OS: Create Order (Pending)
    OS->>PS: OrderCreated Event
    PS->>PS: Process Payment
    alt Success
        PS->>IS: PaymentProcessed Event
        IS->>IS: Reserve Inventory
        IS->>OS: InventoryReserved Event
        OS->>OS: Approve Order
    else Failure
        PS->>OS: PaymentFailed Event
        OS->>OS: Cancel Order
    end
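Here is a rough sketch of that event flow in code. The `EventBus` class is an in-memory stand-in for a real message broker, and the handler names, the `card_limit` field, and the status strings are all my own assumptions, not any library’s API.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Handler = Callable[[dict], None]


class EventBus:
    """In-memory stand-in for a message broker (an assumption of this sketch)."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        for handler in self._handlers[event_type]:
            handler(payload)


bus = EventBus()
orders: Dict[str, str] = {}  # order_id -> status, owned by the Order service


# Payment service: reacts to OrderCreated, knows nothing about inventory.
def on_order_created(event: dict) -> None:
    if event["amount"] <= event["card_limit"]:
        bus.publish("PaymentProcessed", event)
    else:
        bus.publish("PaymentFailed", event)


# Inventory service: reacts to PaymentProcessed (always succeeds here).
def on_payment_processed(event: dict) -> None:
    bus.publish("InventoryReserved", event)


# Order service: reacts to the outcome events.
def on_inventory_reserved(event: dict) -> None:
    orders[event["order_id"]] = "APPROVED"


def on_payment_failed(event: dict) -> None:
    orders[event["order_id"]] = "CANCELLED"


bus.subscribe("OrderCreated", on_order_created)
bus.subscribe("PaymentProcessed", on_payment_processed)
bus.subscribe("InventoryReserved", on_inventory_reserved)
bus.subscribe("PaymentFailed", on_payment_failed)


def create_order(order_id: str, amount: int, card_limit: int) -> None:
    orders[order_id] = "PENDING"
    bus.publish("OrderCreated",
                {"order_id": order_id, "amount": amount, "card_limit": card_limit})


create_order("o-1", amount=50, card_limit=100)
create_order("o-2", amount=500, card_limit=100)
print(orders)
```

Notice that no function describes the whole flow. You only see it by reading the four `subscribe` lines together, which is exactly what gets painful as the number of services grows.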

My take: This is great for simple flows. It’s decoupled. But it’s a nightmare to debug once you have 5+ services. You start wondering, “Wait, which service was supposed to handle the RefundRequested event?”

2. Orchestration (The “Manager” Way)

You have a central “Orchestrator” (a state machine) that tells everyone exactly what to do.

%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#000000','primaryTextColor':'#00ff00','primaryBorderColor':'#00ff00','lineColor':'#00ff00','secondaryColor':'#000000','tertiaryColor':'#000000','noteBorderColor':'#00ff00','noteTextColor':'#00ff00'}}}%%
sequenceDiagram
    autonumber
    participant OS as Order Service
    participant SO as Saga Orchestrator
    participant PS as Payment Service
    participant IS as Inventory Service
    OS->>SO: Start Order Saga
    SO->>PS: Charge Card
    PS-->>SO: Payment Success
    SO->>IS: Reserve Items
    alt Items available
        IS-->>SO: Items Reserved
        SO->>OS: Mark Order as Paid
    else Out of stock
        IS-->>SO: Failed
        SO->>PS: Refund Card (Undo)
        SO->>OS: Cancel Order
    end
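The same flow as an orchestrator might look roughly like this. The service classes are hypothetical in-process stand-ins (real ones would sit behind an API or a queue), and the state names are my own. The sketch also assumes the charge always succeeds, to keep the focus on the out-of-stock branch.

```python
from enum import Enum
from typing import Dict, List, Set


class SagaState(Enum):
    STARTED = "STARTED"
    PAID = "PAID"
    COMPLETED = "COMPLETED"
    CANCELLED = "CANCELLED"


class PaymentService:
    """Hypothetical stand-in that just records what it was asked to do."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def charge(self, order_id: str) -> None:
        self.calls.append(f"charge {order_id}")

    def refund(self, order_id: str) -> None:
        self.calls.append(f"refund {order_id}")


class InventoryService:
    """Hypothetical stand-in that tracks a set of item names in stock."""

    def __init__(self, in_stock: Set[str]) -> None:
        self.in_stock = in_stock

    def reserve(self, item: str) -> bool:
        if item in self.in_stock:
            self.in_stock.remove(item)
            return True
        return False


class OrderSagaOrchestrator:
    def __init__(self, payments: PaymentService, inventory: InventoryService) -> None:
        self.payments = payments
        self.inventory = inventory
        # One place to look up what happened to each order.
        self.history: Dict[str, List[SagaState]] = {}

    def _record(self, order_id: str, state: SagaState) -> SagaState:
        self.history.setdefault(order_id, []).append(state)
        return state

    def run(self, order_id: str, item: str) -> SagaState:
        self._record(order_id, SagaState.STARTED)
        self.payments.charge(order_id)
        self._record(order_id, SagaState.PAID)
        if self.inventory.reserve(item):
            return self._record(order_id, SagaState.COMPLETED)
        # Out of stock: compensate the step that already committed.
        self.payments.refund(order_id)
        return self._record(order_id, SagaState.CANCELLED)


payments = PaymentService()
orchestrator = OrderSagaOrchestrator(payments, InventoryService({"widget"}))
first = orchestrator.run("o-1", "widget")   # the only widget gets reserved
second = orchestrator.run("o-2", "widget")  # out of stock, so the card is refunded
print(first, second, payments.calls)
```

The whole flow, including the compensation, lives in `run`. That is the readability win, and also how the orchestrator ends up knowing about every service.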

My take: Much easier to reason about. You have one place to look to see the state of an order. The downside? The Orchestrator can become a “God Object” that knows too much about everyone else.

The Isolation Problem

Sagas are eventually consistent. They lack the “I” in ACID (Isolation).

This means intermediate state is visible while the saga is still running: a user might see their order as “Pending” or even “Confirmed” before the inventory is actually reserved. You have to design your UI to handle this, with lots of “Processing…” spinners and status updates.
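One way to keep the UI honest is to give the order an explicit status and only move it to “Confirmed” once the saga has finished. This is just a sketch under that assumption; `OrderStatus`, `ALLOWED`, `transition`, and `display_label` are names I made up for illustration.

```python
from enum import Enum


class OrderStatus(Enum):
    PENDING = "PENDING"
    CONFIRMED = "CONFIRMED"
    CANCELLED = "CANCELLED"


# Which moves the saga may make; terminal states allow none.
ALLOWED = {
    OrderStatus.PENDING: {OrderStatus.CONFIRMED, OrderStatus.CANCELLED},
    OrderStatus.CONFIRMED: set(),
    OrderStatus.CANCELLED: set(),
}


def transition(current: OrderStatus, target: OrderStatus) -> OrderStatus:
    """Return the new status, or raise if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot go from {current.value} to {target.value}")
    return target


def display_label(status: OrderStatus) -> str:
    """What the UI shows while the saga is in flight or finished."""
    labels = {
        OrderStatus.PENDING: "Processing…",
        OrderStatus.CONFIRMED: "Order confirmed",
        OrderStatus.CANCELLED: "Order cancelled",
    }
    return labels[status]


status = OrderStatus.PENDING
print(display_label(status))
status = transition(status, OrderStatus.CONFIRMED)
print(display_label(status))
```

The point is that the user never sees a final-looking state until the last step of the saga has committed.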

What I’m Learning

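I used to try to force distributed transactions using 2PC (Two-Phase Commit). It was slow and brittle: every participant holds its locks until the coordinator decides, and if the coordinator goes down, they stay blocked. Sagas are harder to code but much more resilient.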

The biggest hurdle isn’t the code; it’s the “Undo” logic. How do you “undo” an email that was already sent? You can’t. You have to send a second email saying “Oops, please ignore that.”

Designing for failure is a different mindset than designing for the “happy path.”

Have you tried building a saga? Did you go with Choreography or Orchestration?