Architecture | March 21, 2026

Saga Pattern: Managing Distributed Transactions Without 2PC

Distributed transactions are hard. The Saga pattern breaks them into a sequence of local transactions with compensating actions, avoiding the pitfalls of Two-Phase Commit.

Tags: saga, distributed-systems, microservices, transactions, architecture

In a monolithic application, a database transaction is simple: you either commit everything or roll everything back. But in a microservices architecture, a single business operation can span multiple services, each with its own database. How do you maintain consistency without locking all those databases together?

The answer most people reach for first is Two-Phase Commit (2PC). The answer they should reach for is the Saga pattern.

Why 2PC Falls Apart in Microservices

Two-Phase Commit works by having a coordinator ask all participants to prepare (phase 1), then commit (phase 2). It sounds clean, but it has serious problems at scale:

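Blocking: each participant holds its locks from the prepare phase until the coordinator announces the outcome
Single point of failure: if the coordinator crashes after the prepare phase, participants are stuck holding locks until it recovers
Tight coupling: every service and its database must support the same distributed transaction protocol
Poor availability: if the coordinator or any participant is unreachable (for example, during a network partition), the transaction cannot complete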
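
The blocking and coordinator-failure problems can be illustrated with a small in-memory simulation. This is only a sketch, not a real 2PC implementation: `Participant`, `twoPhaseCommit`, and the `crashAfterPrepare` flag are hypothetical names introduced for illustration.

```typescript
// Hypothetical in-memory model of a 2PC participant; not a real protocol implementation.
class Participant {
  name: string;
  locked = false;
  committed = false;

  constructor(name: string) {
    this.name = name;
  }

  // Phase 1: vote yes and hold a lock until the coordinator decides.
  prepare(): boolean {
    this.locked = true;
    return true;
  }

  // Phase 2: apply the change and release the lock.
  commit(): void {
    this.committed = true;
    this.locked = false;
  }
}

// Runs both phases; `crashAfterPrepare` simulates the coordinator dying between them.
function twoPhaseCommit(participants: Participant[], crashAfterPrepare: boolean): void {
  const allPrepared = participants.every((p) => p.prepare());
  if (crashAfterPrepare) {
    return; // No decision is ever sent, so every participant stays locked.
  }
  if (allPrepared) {
    participants.forEach((p) => p.commit());
  }
}

const services = [new Participant('orders'), new Participant('inventory')];
twoPhaseCommit(services, true);
const stuck = services.filter((p) => p.locked).map((p) => p.name);
console.log(stuck);
```

Both participants remain locked because no commit or abort decision ever arrives, which is the stuck state described in the list above.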

In microservices, you trade some consistency guarantees for availability and autonomy. The Saga pattern is designed for exactly that tradeoff.

What Is the Saga Pattern?

A Saga is a sequence of local transactions. Each step updates one service’s database and publishes an event or message. If a step fails, the saga executes compensating transactions to undo the work done by previous steps.

Think of it like a coordinated series of reversible steps:

Step 1: Create Order      → success → trigger Step 2
Step 2: Reserve Inventory → success → trigger Step 3
Step 3: Charge Payment    → FAILURE → trigger Compensation
Compensation 2: Release Inventory
Compensation 1: Cancel Order

No global lock. No coordinator holding everything hostage. Just events and compensations.
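
The flow above can be sketched as a small generic runner. This is a minimal, synchronous, in-memory sketch under stated assumptions: `SagaStepDef` and `runSaga` are hypothetical names, a step signals failure by throwing, and real steps would be asynchronous calls to other services.

```typescript
// A step pairs a local transaction with the action that semantically undoes it.
interface SagaStepDef {
  name: string;
  action: () => void; // throws on failure
  compensate: () => void;
}

// Runs steps in order. On the first failure, compensates the completed
// steps in reverse order and stops. Returns a log of what happened.
function runSaga(steps: SagaStepDef[]): string[] {
  const log: string[] = [];
  const completed: SagaStepDef[] = [];

  for (const step of steps) {
    try {
      step.action();
    } catch {
      log.push(`${step.name}: failed`);
      // Undo completed steps, most recent first.
      for (const done of completed.reverse()) {
        done.compensate();
        log.push(`${done.name}: compensated`);
      }
      return log;
    }
    completed.push(step);
    log.push(`${step.name}: ok`);
  }
  return log;
}

const noop = (): void => {};
const demoSteps: SagaStepDef[] = [
  { name: 'create-order', action: noop, compensate: noop },
  { name: 'reserve-inventory', action: noop, compensate: noop },
  {
    name: 'charge-payment',
    action: () => {
      throw new Error('card declined');
    },
    compensate: noop,
  },
];

const sagaLog = runSaga(demoSteps);
console.log(sagaLog.join('\n'));
```

With the failing payment step, the log records the two successful steps, the failure, and then the inventory and order compensations in that order, mirroring the trace above.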

Two Flavors: Choreography vs Orchestration

Choreography

Each service listens for events and reacts accordingly. There’s no central brain.

// OrderService
async function createOrder(orderData: OrderData) {
  const order = await db.orders.create({ ...orderData, status: 'PENDING' });
  await eventBus.publish('order.created', { orderId: order.id, items: order.items });
  return order;
}

// InventoryService listens to 'order.created'
eventBus.on('order.created', async ({ orderId, items }) => {
  let reserved = true;
  try {
    // Pass orderId so the reservation can be deduplicated and released later.
    // Assumes reserveItems throws when stock is insufficient.
    await reserveItems(orderId, items);
  } catch (error) {
    reserved = false;
  }
  await eventBus.publish(reserved ? 'inventory.reserved' : 'inventory.failed', { orderId });
});

// PaymentService listens to 'inventory.reserved'
eventBus.on('inventory.reserved', async ({ orderId }) => {
  const order = await getOrder(orderId);
  const charged = await chargePayment(order.customerId, order.total);
  if (charged) {
    await eventBus.publish('payment.completed', { orderId });
  } else {
    await eventBus.publish('payment.failed', { orderId });
  }
});

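// OrderService cancels the order when any downstream step fails
async function cancelOrder({ orderId }: { orderId: string }) {
  await db.orders.update(orderId, { status: 'CANCELLED' });
  await eventBus.publish('order.cancelled', { orderId });
}
eventBus.on('inventory.failed', cancelOrder);
eventBus.on('payment.failed', cancelOrder);

// OrderService confirms the order once payment succeeds
eventBus.on('payment.completed', async ({ orderId }) => {
  await db.orders.update(orderId, { status: 'CONFIRMED' });
});

// InventoryService listens to 'order.cancelled' to release stock.
// This must be a no-op when nothing was reserved (e.g., after 'inventory.failed').
eventBus.on('order.cancelled', async ({ orderId }) => {
  await releaseReservedItems(orderId);
});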

Pros: services are loosely coupled (they share only event contracts), no central coordinator to fail
Cons: hard to track overall saga state, logic scattered across services
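
To experiment with the choreography flow locally, a tiny in-memory bus is enough. The sketch below is a hypothetical stand-in for a real message broker (`InMemoryEventBus`, `SagaEventPayload`, and the `history` field are names made up here): delivery is in-process and not durable. It also records every published event, which hints at the kind of central trace you need to follow a choreographed saga.

```typescript
// Hypothetical in-memory stand-in for a message broker; in-process and not durable.
type SagaEventPayload = { orderId: string };
type SagaEventHandler = (payload: SagaEventPayload) => void | Promise<void>;

class InMemoryEventBus {
  private handlers = new Map<string, SagaEventHandler[]>();
  // Every published event name, in order, so the overall flow can be inspected.
  readonly history: string[] = [];

  on(eventName: string, handler: SagaEventHandler): void {
    const existing = this.handlers.get(eventName) ?? [];
    this.handlers.set(eventName, [...existing, handler]);
  }

  async publish(eventName: string, payload: SagaEventPayload): Promise<void> {
    this.history.push(eventName);
    for (const handler of this.handlers.get(eventName) ?? []) {
      await handler(payload);
    }
  }
}

// Wire up a failing-payment scenario with stub handlers.
const bus = new InMemoryEventBus();
bus.on('order.created', ({ orderId }) => bus.publish('inventory.reserved', { orderId }));
bus.on('inventory.reserved', ({ orderId }) => bus.publish('payment.failed', { orderId }));
bus.on('payment.failed', ({ orderId }) => bus.publish('order.cancelled', { orderId }));

const flow = bus.publish('order.created', { orderId: 'order-1' }).then(() => bus.history);
flow.then((events) => console.log(events.join(' -> ')));
```

Unlike a real broker, this bus runs handlers inside the publisher's call, so it is only suitable for tests and demos.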

Orchestration

A central orchestrator tells each service what to do and handles failures.

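class OrderSagaOrchestrator {
  async execute(orderData: OrderData): Promise<void> {
    let orderId: string | null = null;
    let inventoryReserved = false;
    let paymentCharged = false;

    try {
      // Step 1: create the order in a pending state
      const order = await orderService.createOrder(orderData);
      orderId = order.id;

      // Step 2: reserve stock, keyed by order ID so release(orderId) can find it
      await inventoryService.reserve(order.id, order.items);
      inventoryReserved = true;

      // Step 3: charge the customer
      await paymentService.charge(order.customerId, order.total);
      paymentCharged = true;

      // All steps succeeded
      await orderService.confirm(order.id);
    } catch (error) {
      // Compensate completed steps in reverse order.
      // paymentService.refund is an assumed API; it is needed if confirm() fails after the charge.
      if (paymentCharged) {
        await paymentService.refund(orderId!);
      }
      if (inventoryReserved) {
        await inventoryService.release(orderId!);
      }
      if (orderId) {
        await orderService.cancel(orderId);
      }
      throw error;
    }
  }
}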

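Pros: the workflow is defined in one place, saga state is easy to track, failures are easier to debug
Cons: the orchestrator is an extra component to deploy and keep available, and every participating service is coupled to it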
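
One consequence of centralizing failure handling is that the orchestrator has to cope with compensations that fail themselves, since a compensation that is abandoned leaves the system inconsistent. A common approach is to retry with backoff. The helper below is a minimal sketch; `retryWithBackoff`, `sleep`, and `flakyRelease` are hypothetical names, and the delays are kept tiny for demonstration.

```typescript
const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Retries an async operation, doubling the delay after each failed attempt.
// Rethrows the last error once maxAttempts is exhausted.
async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  maxAttempts: number,
  baseDelayMs: number,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts) {
        await sleep(baseDelayMs * 2 ** (attempt - 1));
      }
    }
  }
  throw lastError;
}

// Stub compensation that fails twice before succeeding.
let attempts = 0;
const flakyRelease = async (): Promise<string> => {
  attempts += 1;
  if (attempts < 3) {
    throw new Error('inventory service unavailable');
  }
  return 'released';
};

const releaseResult = retryWithBackoff(flakyRelease, 5, 1);
releaseResult.then((value) => console.log(value, attempts));
```

Retrying is only safe when the compensation is idempotent, because a call that timed out may still have taken effect.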

Handling Idempotency

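Sagas rely on messaging, and messaging systems commonly provide at-least-once delivery, so the same event can arrive more than once. Every step (and every compensation) must therefore be idempotent: running it twice must leave the system in the same state as running it once.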

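async function reserveItems(orderId: string, items: Item[]): Promise<void> {
  // Fast path: skip if this order already has a reservation
  const existing = await db.reservations.findOne({ orderId });
  if (existing) {
    return; // Already done, skip
  }

  // The check above is not atomic: two concurrent deliveries can both pass it.
  // Assumes a unique constraint on orderId, so the database rejects the second insert.
  try {
    await db.reservations.create({ orderId, items, createdAt: new Date() });
  } catch (error) {
    // isUniqueViolation is a hypothetical helper; detecting a duplicate-key
    // error depends on your database driver. A duplicate means "already reserved".
    if (!isUniqueViolation(error)) throw error;
  }
}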

Use unique keys (like orderId) as idempotency keys in your database operations.
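
The effect of using orderId as an idempotency key can be shown with an in-memory stand-in for a table that has a unique constraint on it. This is a sketch under stated assumptions: `ReservationTable`, `insertIfAbsent`, and `onOrderCreated` are hypothetical names, and a real database would enforce the constraint for you.

```typescript
// In-memory stand-in for a table with a unique constraint on orderId (hypothetical).
class ReservationTable {
  private rows = new Map<string, string[]>();

  // Mimics "insert, or do nothing on key conflict":
  // returns true only when a new row is written.
  insertIfAbsent(orderId: string, items: string[]): boolean {
    if (this.rows.has(orderId)) {
      return false;
    }
    this.rows.set(orderId, items);
    return true;
  }

  get size(): number {
    return this.rows.size;
  }
}

interface OrderCreatedEvent {
  orderId: string;
  items: string[];
}

// Idempotent handler: redelivery of the same event leaves the table unchanged.
function onOrderCreated(
  table: ReservationTable,
  orderCreated: OrderCreatedEvent,
): 'reserved' | 'duplicate' {
  return table.insertIfAbsent(orderCreated.orderId, orderCreated.items)
    ? 'reserved'
    : 'duplicate';
}

// Deliver the same event twice, as an at-least-once broker might.
const table = new ReservationTable();
const orderCreated: OrderCreatedEvent = { orderId: 'order-1', items: ['sku-1', 'sku-2'] };
const outcomes = [onOrderCreated(table, orderCreated), onOrderCreated(table, orderCreated)];
console.log(outcomes, table.size);
```

The second delivery is recognized as a duplicate and only one reservation exists afterwards.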

Tracking Saga State

For orchestrated sagas, persisting the saga state helps with debugging and recovery:

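type StepStatus = 'done' | 'failed';

interface SagaState {
  id: string;
  status: 'STARTED' | 'INVENTORY_RESERVED' | 'PAYMENT_CHARGED' | 'COMPLETED' | 'COMPENSATING' | 'FAILED';
  orderId: string;
  // Keyed by step name so updateSagaStep can address a single step by path
  steps: Record<string, StepStatus>;
  createdAt: Date;
  updatedAt: Date;
}

// Assumes the data layer accepts dotted paths for partial updates (as MongoDB-style APIs do)
async function updateSagaStep(sagaId: string, step: string, status: StepStatus) {
  await db.sagas.update(sagaId, {
    [`steps.${step}`]: status,
    updatedAt: new Date(),
  });
}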
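
Persisted state is what makes recovery possible: after a crash, the orchestrator can read the saved record and work out which steps still need compensating. The function below is a minimal sketch of that decision. `STEP_ORDER`, `PersistedSaga`, and `stepsToCompensate` are hypothetical names, and the step names are made up for this example.

```typescript
type RecordedStatus = 'done' | 'failed';

// The saga's steps in execution order; names are hypothetical.
const STEP_ORDER = ['createOrder', 'reserveInventory', 'chargePayment'] as const;
type StepName = (typeof STEP_ORDER)[number];

interface PersistedSaga {
  id: string;
  // Steps that never ran are simply absent.
  steps: Partial<Record<StepName, RecordedStatus>>;
}

// Given the persisted state of an interrupted saga, list the steps
// that completed and therefore need compensating, latest first.
function stepsToCompensate(saga: PersistedSaga): StepName[] {
  return STEP_ORDER.filter((stepName) => saga.steps[stepName] === 'done').reverse();
}

const interrupted: PersistedSaga = {
  id: 'saga-42',
  steps: { createOrder: 'done', reserveInventory: 'done', chargePayment: 'failed' },
};
const pending = stepsToCompensate(interrupted);
console.log(pending);
```

Here the payment step failed, so only the inventory reservation and the order need to be undone, in that order.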

When to Use the Saga Pattern

Use sagas when:

A business transaction spans multiple microservices
You can accept eventual consistency (brief windows of inconsistency are tolerable)
High availability is more important than immediate consistency

Don’t use sagas when:

You need strict ACID guarantees (stay in a single DB transaction)
The workflow is simple enough to fit in one service
Your team isn’t ready for the complexity overhead

Common Pitfalls

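Dirty reads: sagas provide no isolation between steps, so another process can read data that a saga has written but may still compensate away. Guard against this by making intermediate state explicit (e.g., an order stays PENDING until the saga completes, and readers treat only CONFIRMED orders as final).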
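
A minimal sketch of that status guard on the read side, assuming a hypothetical `OrderRow` shape and a made-up `confirmedRevenue` report:

```typescript
type OrderStatus = 'PENDING' | 'CONFIRMED' | 'CANCELLED';

interface OrderRow {
  id: string;
  total: number;
  status: OrderStatus;
}

// Readers that need settled data ignore orders whose saga is still
// in flight (PENDING) or was rolled back (CANCELLED).
function confirmedRevenue(orders: OrderRow[]): number {
  return orders
    .filter((order) => order.status === 'CONFIRMED')
    .reduce((sum, order) => sum + order.total, 0);
}

const orderRows: OrderRow[] = [
  { id: 'order-1', total: 100, status: 'CONFIRMED' },
  { id: 'order-2', total: 250, status: 'PENDING' }, // saga still running
  { id: 'order-3', total: 40, status: 'CANCELLED' }, // saga compensated
];
const revenue = confirmedRevenue(orderRows);
console.log(revenue);
```

Only the confirmed order counts, so a saga that later compensates never distorts the report.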

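Long-running sagas: the longer a saga runs, the longer the system stays in an intermediate state and the more likely a step is to fail or stall. Give each step a timeout, and decide up front whether a timed-out step is retried or compensated.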
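
One way to act on timeouts is a periodic sweep over persisted saga records that flags any saga that has made no progress for too long. The sketch below assumes a hypothetical `RunningSaga` record with an `updatedAt` timestamp; `findStalledSagas` is a made-up name, and the five-minute threshold is arbitrary.

```typescript
interface RunningSaga {
  id: string;
  status: 'STARTED' | 'COMPLETED' | 'FAILED';
  updatedAt: Date;
}

// Returns the ids of sagas that are still in progress and have
// made no progress within timeoutMs.
function findStalledSagas(sagas: RunningSaga[], now: Date, timeoutMs: number): string[] {
  return sagas
    .filter((saga) => saga.status === 'STARTED')
    .filter((saga) => now.getTime() - saga.updatedAt.getTime() > timeoutMs)
    .map((saga) => saga.id);
}

const checkTime = new Date('2026-03-21T12:00:00Z');
const activeSagas: RunningSaga[] = [
  { id: 'saga-1', status: 'STARTED', updatedAt: new Date('2026-03-21T11:59:30Z') }, // 30 s ago
  { id: 'saga-2', status: 'STARTED', updatedAt: new Date('2026-03-21T11:50:00Z') }, // 10 min ago
  { id: 'saga-3', status: 'COMPLETED', updatedAt: new Date('2026-03-21T11:00:00Z') },
];
const stalled = findStalledSagas(activeSagas, checkTime, 5 * 60 * 1000);
console.log(stalled);
```

Flagged sagas can then be retried or moved into compensation, depending on the policy chosen for the step that stalled.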

Compensation isn’t always possible: some actions (like sending an email) can’t be undone. Design compensations carefully, and accept that “undo” sometimes means “send an apology email”.

Key Takeaways

Sagas replace distributed ACID transactions with a sequence of local transactions + compensations
Choreography is more decoupled but harder to observe; Orchestration is clearer but adds a coordinator
Every step must be idempotent
Persist saga state to enable recovery and debugging
Accept eventual consistency as a design constraint, not a bug