$ guides / postgres / postgres-alter-table-blocked ▌

Operations Guides

PostgreSQL ALTER TABLE blocked: zero-downtime DDL patterns

An ALTER TABLE to add a column or change a type hangs. Application queries time out. Connection pools saturate. What looked like a simple schema change becomes a production incident.

By default, ALTER TABLE acquires an ACCESS EXCLUSIVE lock. It conflicts with every other lock mode, including ACCESS SHARE held by a plain SELECT. Once the DDL statement queues behind a blocker, every subsequent read and write on that table queues behind the waiting DDL. The table goes offline before the ALTER TABLE executes any work.

The outage lasts until the blocker releases its lock plus the time for the DDL to complete. If the blocker is idle in transaction and has been open for hours, the outage extends until the session is terminated or times out.

flowchart TD
    A[Running query holds AccessShareLock] --> B[ALTER TABLE waiting for AccessExclusiveLock]
    B --> C[New queries queue here]
    B --> D[Application timeouts spike]
    E[AccessExclusiveLock blocks all other locks] -.-> A

Common causes

Cause	What it looks like	First thing to check
Long-running query or transaction holding `ACCESS SHARE`	`ALTER TABLE` hangs immediately; application queries were running fine before the DDL	`pg_stat_activity` for old `query_start` or `xact_start` on the target relation
`idle in transaction` session holding a relation lock	No active query visible, but `pg_locks` shows granted locks on the table; `state = 'idle in transaction'`	`pg_stat_activity.state_change` age for idle sessions
DDL already queued and causing a lock cascade	Multiple sessions in `wait_event_type = 'Lock'` on the same relation; normally fast queries are stalled	`pg_locks` for ungranted `AccessExclusiveLock` and queue depth
Unsafe DDL pattern that could use a weaker-lock alternative	`CREATE INDEX` without `CONCURRENTLY`, `ADD FOREIGN KEY` without `NOT VALID`, or `SET NOT NULL` directly on a large table	The DDL statement itself for missing `CONCURRENTLY`, `NOT VALID`, or `VALIDATE` keywords

Quick checks

Run these in order. The first three are read-only.

-- Check 1: Find ungranted locks and the queries waiting for them
SELECT pid, mode, granted, relation::regclass, query_start, query
FROM pg_locks
JOIN pg_stat_activity USING (pid)
WHERE NOT granted
ORDER BY query_start;

-- Check 2: Identify the blocking PIDs behind Lock waits
SELECT DISTINCT unnest(pg_blocking_pids(pid)) AS blocker_pid
FROM pg_stat_activity
WHERE wait_event_type = 'Lock';

-- Check 3: Details of the blocking sessions
SELECT pid, usename, state, query_start, xact_start, query
FROM pg_stat_activity
WHERE pid = ANY(
  ARRAY(SELECT DISTINCT unnest(pg_blocking_pids(pid)) FROM pg_stat_activity WHERE wait_event_type = 'Lock')
);

-- Check 4: Lock state for a specific table
SELECT mode, granted, pid, query
FROM pg_locks
JOIN pg_stat_activity USING (pid)
WHERE relation = 'my_table'::regclass;

-- Check 5: Current lock timeout (0 means the DDL can wait forever)
SHOW lock_timeout;

How to diagnose it

Confirm the DDL is waiting. Query pg_stat_activity for the ALTER TABLE session. Look for wait_event_type = 'Lock' and a query_start older than the command should take.
Find the blocking session. Use pg_blocking_pids() or join pg_locks with pg_stat_activity. Note the blocker’s state: active, idle in transaction, or idle.
Check transaction age. If xact_start is old but query_start is recent, the transaction has been open a long time and may be holding locks despite appearing idle.
Assess queue depth. Count how many sessions are waiting on the same relation. A long queue means a full lock cascade is in progress.
Decide on intervention. If the blocker is idle in transaction with no valuable work, terminating it is usually safe. If it is a long-running report or backup query, weigh the cost of canceling against the cost of the table remaining locked.
Break the cascade. If the DDL has not yet been granted its lock, cancel the DDL itself to immediately free the queue. Retry later with a safer pattern or a short lock_timeout.

Metrics and signals to monitor

Signal	Why it matters	Warning sign
`pg_stat_activity` with `wait_event_type = 'Lock'` on the target relation	Direct indicator of lock contention	Sustained `Lock` wait > 30 seconds
`pg_locks.granted = false` with `mode = AccessExclusiveLock`	Shows DDL is queued and blocking subsequent traffic	Any ungranted `AccessExclusiveLock` on a production table
`pg_stat_activity.state = 'idle in transaction'` with old `state_change`	Idle transactions hold locks and block vacuum and DDL	Sessions idle in transaction > 5 minutes
`pg_stat_database.deadlocks`	Deadlocks can form when lock queues interact with concurrent updates	Incrementing deadlocks during DDL windows
Query latency p99 on the table	Lock cascades manifest as application timeouts	p99 latency > 10x baseline during schema changes

Fixes

Terminate the blocker

If the blocker is idle in transaction and has been idle longer than your application tolerance, terminate it. This kills that session and breaks the logjam.

-- Terminate an idle-in-transaction blocker
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE pid = <blocker_pid>
  AND state = 'idle in transaction';

If the blocker is an active long-running query, try pg_cancel_backend first. Use pg_terminate_backend only if cancel does not respond.

Cancel the queued DDL to stop the cascade

If the ALTER TABLE is still waiting, cancel it. The lock queue behind it will clear, and normal traffic resumes. Retry the operation during a quieter window or with a safer pattern.

-- Cancel the waiting DDL
SELECT pg_cancel_backend(pid)
FROM pg_stat_activity
WHERE query LIKE '%ALTER TABLE%'
  AND wait_event_type = 'Lock';

Use lock_timeout to fail fast

Before running any DDL, set a short lock_timeout so the statement aborts rather than camping in the queue.

-- Fail DDL if lock is not available in 2 seconds
SET LOCAL lock_timeout = '2s';
ALTER TABLE ...

When the timeout fires, PostgreSQL returns an error and releases the queue position. Retry with exponential backoff from your migration tool.

Replace plain CREATE INDEX with CONCURRENTLY

Plain CREATE INDEX acquires SHARE lock, which blocks writes. CREATE INDEX CONCURRENTLY uses SHARE UPDATE EXCLUSIVE, allowing reads and writes to continue. It takes longer and uses more CPU, but it does not block traffic. It cannot run inside a transaction block.

CREATE INDEX CONCURRENTLY idx_name ON table_name(column_name);

Split constraint addition into NOT VALID and VALIDATE

Adding a CHECK or FOREIGN KEY constraint directly performs a full table scan under ACCESS EXCLUSIVE. Split the operation:

-- Step 1: Add without validating (brief lock, metadata-only change)
ALTER TABLE orders ADD CONSTRAINT check_positive CHECK (amount > 0) NOT VALID;

-- Step 2: Validate in a separate statement (weaker lock, allows reads and writes)
ALTER TABLE orders VALIDATE CONSTRAINT check_positive;

The VALIDATE step acquires SHARE UPDATE EXCLUSIVE, not ACCESS EXCLUSIVE.

Use expand-contract for hazardous changes

For column type changes, renames, or backfill requirements, use the expand-contract pattern. Add the new column or table alongside the old one, backfill asynchronously, switch application logic, then drop the old artifact. This avoids table rewrites and long-lived exclusive locks.

Use online rewrite tools for large tables

When PostgreSQL’s built-in options are insufficient, use pg_repack for online table reorganization. It builds a shadow table, replays changes, and swaps the tables with only a brief ACCESS EXCLUSIVE lock at the end. It requires a primary key or unique NOT NULL index and approximately 2x disk space. PgOnlineSchemaChange (pg-osc) is another option for minimal-locking schema changes on large tables.

Prevention

Review DDL in CI. Flag any migration that runs ALTER TABLE without CONCURRENTLY, NOT VALID, or VALIDATE.
Set lock_timeout by default. Never run production DDL with an indefinite lock wait. A 1-5 second timeout turns a queue camp into a fast retry.
Schedule heavy DDL in maintenance windows. Run risky changes when long-running reports and analytic queries are not active.
Prevent idle-in-transaction blockers. Set idle_in_transaction_session_timeout to a reasonable value so zombie sessions cannot hold locks for hours.
Maintain a runbook. Document which application PIDs and users are safe to terminate, and who must approve a blocker kill during business hours.

How Netdata helps

Netdata collects pg_stat_activity state breakdowns, including Lock waits and idle in transaction sessions, so you can spot blocker patterns without querying the system catalogs.
Per-table query latency percentiles help confirm whether a migration caused a traffic stall and whether recovery worked.
Correlating pg_stat_database.deadlocks with deployment timestamps exposes schema changes that introduced contention.
Connection state charts show the moment a DDL queue consumes available connection headroom, before the application returns FATAL: too many connections errors.

The Netdata solution

PostgreSQL monitoring with Netdata

Netdata monitors PostgreSQL with per-second metrics, pre-built dashboards, and ML-powered anomaly detection. Correlate connection saturation, lock waits, autovacuum progress, replication lag, and checkpoint I/O against the rest of your stack so you catch the incidents in these runbooks before they page anyone.

See PostgreSQL monitoring → Start monitoring free

PostgreSQL ALTER TABLE blocked: zero-downtime DDL patterns

PostgreSQL ALTER TABLE blocked: zero-downtime DDL patterns

Common causes

Quick checks

How to diagnose it

Metrics and signals to monitor

Fixes

Terminate the blocker

Cancel the queued DDL to stop the cascade

Use lock_timeout to fail fast

Replace plain CREATE INDEX with CONCURRENTLY

Split constraint addition into NOT VALID and VALIDATE

Use expand-contract for hazardous changes

Use online rewrite tools for large tables

Prevention

How Netdata helps

Related guides

PostgreSQL monitoring with Netdata