Every application evolves. Columns get added, tables get renamed, relationships change. The question isn’t whether you’ll need database migrations — it’s whether they’ll be smooth or catastrophic.

The naive approach — shut down the app, run migrations, restart — works for hobby projects. In production, it means downtime, angry users, and 3 AM deployment windows. There’s a better way.

The Fundamental Problem

In a running system, your database schema and application code must agree. When you change one, the other breaks — unless you’re careful about the order.

Time 0: Old App ↔ Old Schema   ✅ Compatible
Time 1: Old App ↔ New Schema   ❌ Old app crashes (missing column? new constraint?)
Time 2: New App ↔ New Schema   ✅ Compatible

The gap between Time 1 and Time 2 is your downtime window. The goal is to eliminate it entirely.

The Expand/Contract Pattern

The key strategy is expand/contract (also called parallel change). Instead of one breaking change, you make three safe changes:

  1. Expand: Add the new structure alongside the old one
  2. Migrate: Copy data, update code to use both
  3. Contract: Remove the old structure

Each step is backward-compatible. At no point does anything break.

Example: Renaming a Column

You need to rename users.name to users.full_name. Here’s the wrong way:

-- ❌ BREAKING: Every query using 'name' explodes instantly
ALTER TABLE users RENAME COLUMN name TO full_name;

Here’s the expand/contract way:

Step 1: Expand — Add the new column

-- Safe: adding a column doesn't break existing queries
ALTER TABLE users ADD COLUMN full_name VARCHAR(255);

-- Backfill existing data. If you backfill before deploying the
-- dual-write code below, re-run this afterwards: rows updated in the
-- gap would otherwise carry a stale full_name.
UPDATE users SET full_name = name WHERE full_name IS DISTINCT FROM name;

Step 2: Dual-write — Update application code

// Write to BOTH columns during the transition
async function updateUserName(userId: string, newName: string): Promise<void> {
  await db.query(
    "UPDATE users SET name = $1, full_name = $1 WHERE id = $2",
    [newName, userId]
  );
}

// Read from the NEW column, fall back to old
async function getUserName(userId: string): Promise<string> {
  const result = await db.query(
    "SELECT COALESCE(full_name, name) as display_name FROM users WHERE id = $1",
    [userId]
  );
  return result.rows[0].display_name;
}

Deploy this code. Both old and new instances can run safely.
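Before contracting, it is worth verifying that the backfill actually converged. A small sketch of that check, with the query function injected so it can run against any client (in this article's setup it would be the `db.query` helper shown above; `isRenameBackfillComplete` is a hypothetical name, not from the article):

```typescript
// Sketch: verify the expand-phase backfill converged before contracting.
type QueryFn = (sql: string, params?: unknown[]) => Promise<{ rows: any[] }>;

async function isRenameBackfillComplete(query: QueryFn): Promise<boolean> {
  // Count rows where the old and new columns disagree. IS DISTINCT FROM
  // treats NULL as a mismatch, so never-backfilled rows are counted too.
  const result = await query(
    "SELECT COUNT(*) AS mismatches FROM users WHERE full_name IS DISTINCT FROM name"
  );
  return Number(result.rows[0].mismatches) === 0;
}
```

Run this check, and see it stay at zero for a while under live traffic, before moving on to dropping name.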

Step 3: Contract — Remove the old column

Once all application instances use full_name:

-- Safe: no code references 'name' anymore
ALTER TABLE users DROP COLUMN name;

Three deployments instead of one. Zero downtime.

Common Migration Patterns

Adding a Required Column

You can’t add a NOT NULL column without a default — existing rows would violate the constraint.

-- ❌ BREAKING: fails if the table has any rows
ALTER TABLE orders ADD COLUMN status VARCHAR(50) NOT NULL;

-- ✅ SAFE: add with a default, make NOT NULL later
ALTER TABLE orders ADD COLUMN status VARCHAR(50) DEFAULT 'pending';

-- After verifying no NULL values remain:
ALTER TABLE orders ALTER COLUMN status SET NOT NULL;
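On a big table, even the backfill step deserves care: a single giant UPDATE holds row locks and runs in one long transaction. A hypothetical batched version, assuming the column was first added nullable and a `db.query`-style helper is injected (the batch size and pause are illustrative, not tuned values):

```typescript
// Sketch: backfill a large table in small batches so no single
// transaction touches millions of rows at once.
type QueryFn = (sql: string, params?: unknown[]) => Promise<{ rowCount: number }>;

async function backfillStatusInBatches(query: QueryFn, batchSize = 1000): Promise<void> {
  for (;;) {
    // Update only rows that still need the value, one batch at a time.
    const result = await query(
      `UPDATE orders SET status = 'pending'
         WHERE id IN (SELECT id FROM orders WHERE status IS NULL LIMIT $1)`,
      [batchSize]
    );
    if (result.rowCount === 0) break; // nothing left to backfill
    await new Promise((r) => setTimeout(r, 100)); // brief pause between batches
  }
}
```

Each batch commits independently, so a crash midway loses nothing: re-running the loop simply picks up the remaining NULL rows.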

Splitting a Table

When a table grows too large (user data + preferences + settings), split it gradually:

-- Step 1: Create the new table
CREATE TABLE user_preferences (
  user_id UUID PRIMARY KEY REFERENCES users(id),
  theme VARCHAR(50) DEFAULT 'light',
  language VARCHAR(10) DEFAULT 'en',
  notifications_enabled BOOLEAN DEFAULT true
);

-- Step 2: Backfill from the original table
INSERT INTO user_preferences (user_id, theme, language, notifications_enabled)
SELECT id, theme, language, notifications_enabled FROM users
ON CONFLICT (user_id) DO NOTHING;  -- idempotent: safe to re-run

// Step 3: Dual-read during transition
async function getUserPreferences(userId: string): Promise<Preferences> {
  // Try new table first
  const prefs = await db.query(
    "SELECT * FROM user_preferences WHERE user_id = $1",
    [userId]
  );

  if (prefs.rows.length > 0) {
    return prefs.rows[0];
  }

  // Fall back to old table (for any missed rows)
  const user = await db.query(
    "SELECT theme, language, notifications_enabled FROM users WHERE id = $1",
    [userId]
  );
  return user.rows[0];
}

-- Step 4: After full migration, drop columns from original table
ALTER TABLE users DROP COLUMN theme;
ALTER TABLE users DROP COLUMN language;
ALTER TABLE users DROP COLUMN notifications_enabled;
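The example above covers the read side; during the transition, writes have to land in both places too, or one read path goes stale. A sketch of the write side (`updateTheme` is a hypothetical write path, and `db` stands in for the query helper used above):

```typescript
// Sketch: during the table split, preference writes go to BOTH the old
// users columns and the new user_preferences table, so neither the new
// nor the fallback read path sees stale data.
interface Db {
  query(sql: string, params?: unknown[]): Promise<unknown>;
}

async function updateTheme(db: Db, userId: string, theme: string): Promise<void> {
  // Upsert into the new table ...
  await db.query(
    `INSERT INTO user_preferences (user_id, theme) VALUES ($1, $2)
       ON CONFLICT (user_id) DO UPDATE SET theme = EXCLUDED.theme`,
    [userId, theme]
  );
  // ... and mirror the write into the old column until the contract step.
  await db.query("UPDATE users SET theme = $1 WHERE id = $2", [theme, userId]);
}
```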

Changing a Column Type

Converting a product price stored as INTEGER cents (price_cents) to DECIMAL dollars:

-- Step 1: Add new column
ALTER TABLE products ADD COLUMN price_decimal DECIMAL(10, 2);

-- Step 2: Backfill
UPDATE products SET price_decimal = price_cents / 100.0;

// Step 3: Dual-write in application
async function updatePrice(productId: string, priceInDollars: number): Promise<void> {
  await db.query(
    "UPDATE products SET price_cents = $1, price_decimal = $2 WHERE id = $3",
    [Math.round(priceInDollars * 100), priceInDollars, productId]
  );
}

// Step 4: Read from new column
async function getPrice(productId: string): Promise<number> {
  const result = await db.query(
    "SELECT COALESCE(price_decimal, price_cents / 100.0) as price FROM products WHERE id = $1",
    [productId]
  );
  return result.rows[0].price;
}

-- Step 5: Drop old column, rename new. Ship code that reads 'price'
-- first: the rename would break the Step 4 query, which still selects
-- price_decimal.
ALTER TABLE products DROP COLUMN price_cents;
ALTER TABLE products RENAME COLUMN price_decimal TO price;

Migration Tooling

Versioned Migrations (Sequential)

Most frameworks use numbered migrations that run in order:

migrations/
├── 001_create_users.sql
├── 002_add_email_to_users.sql
├── 003_create_orders.sql
├── 004_add_status_to_orders.sql
└── 005_create_user_preferences.sql

// TypeORM migration example
import { MigrationInterface, QueryRunner } from "typeorm";

export class AddStatusToOrders1709123456 implements MigrationInterface {
  async up(queryRunner: QueryRunner): Promise<void> {
    await queryRunner.query(
      `ALTER TABLE orders ADD COLUMN status VARCHAR(50) DEFAULT 'pending'`
    );
  }

  async down(queryRunner: QueryRunner): Promise<void> {
    await queryRunner.query(
      `ALTER TABLE orders DROP COLUMN status`
    );
  }
}
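Whatever the framework, the core bookkeeping is the same: record which versions have been applied, then run the rest in order. A framework-free sketch of that selection step (the table holding applied versions is assumed, not shown):

```typescript
// Sketch: the selection logic every sequential-migration tool performs.
// Given the migrations on disk and the versions already recorded as
// applied, return what still has to run, in ascending order.
interface Migration {
  version: number;
  name: string;
}

function pendingMigrations(all: Migration[], applied: Set<number>): Migration[] {
  return [...all]
    .sort((a, b) => a.version - b.version) // enforce sequential order
    .filter((m) => !applied.has(m.version));
}
```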

Python with Alembic

# alembic/versions/abc123_add_status_to_orders.py
from alembic import op
import sqlalchemy as sa

revision = "abc123"
down_revision = "xyz789"

def upgrade() -> None:
    op.add_column(
        "orders",
        sa.Column("status", sa.String(50), server_default="pending"),
    )

def downgrade() -> None:
    op.drop_column("orders", "status")

Migration Safety Checklist

Before running any migration in production, verify:

Safe Operations (No Lock, No Downtime)

  • ✅ Adding a nullable column
  • ✅ Adding a column with a default (Postgres 11+, MySQL 8.0.12+)
  • ✅ Creating a new table
  • ✅ Creating an index concurrently (CREATE INDEX CONCURRENTLY)
  • ✅ Adding a new enum value (Postgres)

Dangerous Operations (May Lock, May Break)

  • ⚠️ Adding a NOT NULL constraint (needs table scan)
  • ⚠️ Changing a column type (rewrites the table)
  • ⚠️ Adding a unique constraint (needs table scan)
  • ⚠️ Renaming a column (breaks existing queries)
  • ⚠️ Creating a regular index on a large table (locks writes)

Destructive Operations (Data Loss Risk)

  • 🚫 Dropping a column
  • 🚫 Dropping a table
  • 🚫 Truncating a table
  • 🚫 Changing column constraints that reject existing data
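A checklist like this can also be enforced mechanically in CI. A coarse sketch of such a pre-flight check; real migration linters do proper SQL parsing, and these regexes are only illustrative:

```typescript
// Sketch: bucket a DDL statement into the checklist's three categories.
type Risk = "safe" | "dangerous" | "destructive";

function classifyDdl(sql: string): Risk {
  const s = sql.trim().toUpperCase();
  if (/^DROP TABLE|^TRUNCATE|DROP COLUMN/.test(s)) return "destructive";
  if (/SET NOT NULL|ALTER COLUMN .* TYPE|RENAME COLUMN|ADD CONSTRAINT .* UNIQUE/.test(s))
    return "dangerous";
  if (/^CREATE INDEX CONCURRENTLY|^CREATE TABLE|ADD COLUMN/.test(s)) return "safe";
  return "dangerous"; // unknown statements get the cautious bucket
}
```

Note that a plain CREATE INDEX deliberately falls into the default "dangerous" bucket: only the CONCURRENTLY form is lock-free.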

Index Migrations

Adding an index to a large table can lock writes for minutes. Always use concurrent index creation:

-- ❌ Locks the entire table until the index is built
CREATE INDEX idx_orders_status ON orders(status);

-- ✅ Builds the index without blocking writes
CREATE INDEX CONCURRENTLY idx_orders_status ON orders(status);

The trade-off: concurrent index creation is slower, cannot run inside a transaction block (most migration tools need an opt-out flag for this), and can fail partway, for example when a unique index hits duplicate values, leaving an invalid index behind. Always check that it completed:

-- Check for invalid indexes (failed concurrent builds)
SELECT indexrelid::regclass, indisvalid
FROM pg_index
WHERE NOT indisvalid;
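An invalid index from a failed concurrent build must be dropped before retrying the CREATE. A small sketch that turns the rows from the pg_index query above into cleanup statements (`dropInvalidIndexStatements` is a hypothetical helper):

```typescript
// Sketch: emit DROP statements for indexes left INVALID by failed
// CONCURRENTLY builds. The index names come from the regclass cast in
// the pg_index query above.
function dropInvalidIndexStatements(rows: { indexrelid: string }[]): string[] {
  return rows.map((row) => `DROP INDEX CONCURRENTLY ${row.indexrelid};`);
}
```

Rebuilding then means re-running the original CREATE INDEX CONCURRENTLY from your migration files; on Postgres 12+, REINDEX CONCURRENTLY is the in-place alternative.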

When Things Go Wrong: Rollback Strategies

Forward-Only Migrations

Some teams only write up migrations, never down. The reasoning: rolling back a migration that deleted data doesn’t restore the data. Instead, fix forward:

// If migration 005 broke something, don't rollback to 004.
// Write migration 006 that fixes the problem.
// This is safer because:
// 1. No data loss from partial rollbacks
// 2. The fix is versioned and testable
// 3. You can't accidentally rollback past other migrations
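Forward-only also implies that applied migrations are immutable: you add new files, you never edit old ones. Some tools enforce this by recording a checksum when each migration runs; a minimal sketch of that guard (the hashing and storage details are assumed, not shown):

```typescript
// Sketch: detect edits to already-applied migrations. `applied` maps a
// migration version to the checksum recorded when it ran; `current` is
// recomputed from the files on disk. Any mismatch means history was
// rewritten instead of fixed forward.
function editedAppliedMigrations(
  applied: Map<number, string>,
  current: Map<number, string>
): number[] {
  const edited: number[] = [];
  for (const [version, checksum] of applied) {
    const now = current.get(version);
    if (now !== undefined && now !== checksum) edited.push(version);
  }
  return edited.sort((a, b) => a - b);
}
```

Failing the deploy when this returns anything keeps the "write migration 006" discipline honest.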

Feature Flags for Schema Changes

For complex migrations, use feature flags to control which schema your code uses:

async function getUser(userId: string): Promise<User> {
  if (featureFlags.isEnabled("use_new_user_schema")) {
    return getUserFromNewSchema(userId);
  }
  return getUserFromOldSchema(userId);
}

This lets you deploy the migration independently from the code change and roll back instantly by flipping the flag.
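The featureFlags object above can be as simple as an in-process toggle; a minimal stand-in, enough to back the isEnabled() call (production systems would read from a flag service or config store instead):

```typescript
// Sketch: a minimal feature-flag store with instant toggling.
class FeatureFlags {
  private flags = new Map<string, boolean>();

  isEnabled(name: string): boolean {
    return this.flags.get(name) ?? false; // unknown flags default to off
  }

  set(name: string, on: boolean): void {
    this.flags.set(name, on); // "rollback" is just flipping this back
  }
}
```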

When to Use Each Strategy

Simple column addition? Just add it with a default. One migration, one deploy.

Renaming or restructuring? Use expand/contract. Three deploys, zero downtime.

Splitting or merging tables? Use expand/contract with feature flags. Test thoroughly in staging.

Dropping unused tables/columns? Wait at least 2 weeks after removing all code references. Then drop. Keep a backup.

The overhead of expand/contract feels unnecessary on small projects. But the first time you avoid a 2 AM downtime window because your migration was backward-compatible, you’ll never go back to the naive approach. Safe migrations are a habit — build it early.