PostgreSQL blocking queries: finding the root blocker in a lock cascade
A query that normally finishes in milliseconds is now running for minutes. pg_stat_activity shows a queue of sessions with wait_event_type = 'Lock'. You identify one session holding the contested lock, but terminating it does not clear the queue. That session was itself blocked by another, which was blocked by another. Until you find the session at the head of the chain, the cascade continues.
Use this guide to traverse the lock graph, distinguish direct blockers from the root blocker, and decide whether to terminate or wait. It applies to self-managed PostgreSQL, RDS, Aurora, and containerized deployments where you have access to pg_stat_activity and pg_locks.
What this means
PostgreSQL queues lock requests per object. When a session cannot acquire a lock, it waits. If that session already holds locks on other objects, it becomes a blocker for those objects, creating a cascade. The session you see in pg_locks with granted = true is often only the direct blocker, not the root cause.
pg_blocking_pids(int) returns the direct blockers for a given backend PID (one hop only). To find the root blocker in a cascade, walk the chain recursively. The root is the session that is not itself waiting for a lock, or is waiting on something other than a lock (such as I/O) while still holding the contested resource.
Common lock types in these incidents include relation (table-level), transactionid (row-level contention under MVCC), virtualxid, and advisory locks. Note that pg_locks does not expose which parallel workers belong to which client session, so cascades involving parallel query workers can be harder to trace.
flowchart TD
A[Client session requests lock] --> B{Lock available?}
B -->|Yes| C[Lock granted]
B -->|No| D[Session waits]
D --> E[Direct blocker holds lock]
E --> F{Blocker is waiting?}
F -->|Yes| G[Intermediate blocker]
F -->|No| H[Root blocker]
G --> E
H --> I{Root type}
I --> J[Active query]
I --> K[Idle in transaction]
I --> L[Prepared transaction]Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| Long-running DDL | ALTER TABLE, CREATE INDEX, or VACUUM FULL holding AccessExclusiveLock | pg_locks for mode = 'AccessExclusiveLock' and granted = true |
| Idle in transaction holding locks | state = 'idle in transaction' with an old xact_start and active entries in pg_locks | pg_stat_activity for idle transactions older than 5 minutes |
| Heavy UPDATE/DELETE without index support | Row contention visible as transactionid waits in pg_locks | Root blocker query text in pg_stat_activity |
| Prepared transaction | pg_blocking_pids returns an empty array or the wait involves no running PID | pg_prepared_xacts for stale entries |
| Autovacuum blocked by long query or DDL | Autovacuum worker in pg_stat_activity with wait_event_type = 'Lock' | Whether the root blocker is a long-running query or DDL |
Quick checks
Run these in psql or your monitoring SQL console. They are read-only and safe.
# Show current ungranted locks
psql -c "SELECT pid, locktype, mode, granted, relation::regclass FROM pg_locks WHERE NOT granted LIMIT 5;"
# Show sessions waiting on locks with their direct blockers
psql -c "SELECT pid, wait_event_type, wait_event, pg_blocking_pids(pid) AS blockers, query FROM pg_stat_activity WHERE wait_event_type = 'Lock';"
# Show idle-in-transaction sessions that may be holding locks
psql -c "SELECT pid, usename, state, xact_start, query FROM pg_stat_activity WHERE state = 'idle in transaction' AND xact_start < NOW() - INTERVAL '5 minutes';"
# Show DDL operations holding exclusive locks
psql -c "SELECT pid, mode, relation::regclass, query FROM pg_locks JOIN pg_stat_activity USING (pid) WHERE mode = 'AccessExclusiveLock' AND granted;"
# Check for prepared transactions that cannot be terminated via pg_terminate_backend
psql -c "SELECT gid, prepared, owner FROM pg_prepared_xacts WHERE prepared < NOW() - INTERVAL '1 hour';"
How to diagnose it
Confirm the waits. Query
pg_stat_activitywherewait_event_type = 'Lock'. Record thepid,query, and wait start time. On PostgreSQL 13+,pg_locks.waitstartgives the exact timestamptz the wait began. On earlier versions, derive approximate wait time frompg_stat_activity.query_start.Find direct blockers. Use
pg_blocking_pids(pid)for each waiting backend. PostgreSQL recommends this function over self-joiningpg_locks, which can mislead under heavy lock churn or fast-path lock activity.Walk the chain to the root. If a direct blocker is also waiting on a lock, it is an intermediate node. Repeat until you reach a backend that is not waiting for a lock. A recursive CTE automates the traversal and avoids revisiting the same PID:
WITH RECURSIVE lock_chain AS (
SELECT
pid,
pg_blocking_pids(pid) AS blockers,
query,
0 AS depth,
ARRAY[pid] AS path
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0
UNION ALL
SELECT
a.pid,
pg_blocking_pids(a.pid),
a.query,
lc.depth + 1,
lc.path || a.pid
FROM pg_stat_activity a
JOIN lock_chain lc
ON a.pid = ANY(lc.blockers)
AND NOT a.pid = ANY(lc.path)
)
SELECT * FROM lock_chain ORDER BY depth, pid;
Classify the root blocker. Inspect the root’s
stateinpg_stat_activity.active: running a query. Check whether it is legitimate or runaway.idle in transaction: a zombie session holding a snapshot and locks.- If
pg_blocking_pidsreturns an empty array or the chain ends without a running PID, checkpg_prepared_xacts. A prepared transaction holds locks but does not appear inpg_stat_activitywith a running PID.
Map the lock object. Join
pg_lockswithpg_stat_activityusingpid.locktype = 'relation'withAccessExclusiveLock: DDL is the culprit.locktype = 'transactionid': row-level contention from concurrent UPDATE/DELETE.locktype = 'advisory': application-level locking.
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
pg_stat_activity.wait_event_type = 'Lock' | Indicates active lock waits | Sustained rows for > 30 seconds during peak traffic |
pg_locks.waitstart age (PG 13+) | Exact lock wait duration | Any wait exceeding your query latency SLO |
pg_stat_database.deadlocks | Deadlock detector firing | Increase over a 5-minute window |
log_lock_waits output | Logs blocker and blocked PIDs for waits that exceed deadlock_timeout | Frequent entries in PostgreSQL logs |
pg_stat_activity.state = 'idle in transaction' count | Zombie sessions hold locks and block vacuum | Any session in this state for > 5 minutes |
| Connection utilization | Lock cascades consume the connection budget | Active + waiting connections > 80% of max_connections |
Fixes
Terminate the root blocker
If the root is an idle in transaction zombie or a runaway query, use:
SELECT pg_terminate_backend(<pid>);
This forces rollback, releasing all locks held by that backend. Every waiter in the cascade then retries. Do not terminate if the root is a legitimate long-running batch job or an ALTER TABLE in progress; rollback may take minutes and hold locks until it completes.
Terminate intermediate waiters
If the root blocker is legitimate and must finish, but the cascade is consuming all connections and risking an outage, terminate some of the waiting sessions. This preserves connection headroom but does not restore throughput until the root finishes.
Resolve prepared transactions
A prepared transaction cannot be killed with pg_terminate_backend. If the root is a stale prepared transaction identified in pg_prepared_xacts, commit it if the application logic allows, or roll it back:
ROLLBACK PREPARED '<gid>';
Tune timeouts
Set idle_in_transaction_session_timeout = '2min' to prevent abandoned transactions from holding locks indefinitely. Set lock_timeout = '1s' for application-side DDL or schema migration scripts so they fail fast instead of queuing behind existing locks. Set statement_timeout appropriate to your workload to cap runaway queries.
Prevention
- Enable
log_lock_waits = on. Combined with the defaultdeadlock_timeout = 1s, this logs the blocker PID, blocked PID, lock type, and query text for waits exceeding one second. Essential for post-incident forensics. - Schedule heavy DDL during maintenance windows.
ALTER TABLE,TRUNCATE, andVACUUM FULLacquireAccessExclusiveLockand block all concurrent access. - Index foreign keys. Unindexed foreign keys on child tables can turn parent updates into long-lasting table locks.
- Commit promptly. Do not open a transaction, perform a query, and then wait for an external service before committing. Connection pooling does not fix application-level transaction boundaries.
- Monitor replication slots. An inactive logical replication slot retains WAL and can indirectly pressure the primary, though it does not directly cause lock cascades.
How Netdata helps
Netdata correlates postgres.locks_wait metrics with query duration and connection state charts, showing when lock waits spike relative to active query load. Alerts on sustained lock wait time or a growing count of idle in transaction connections can surface cascades before connection exhaustion.
The pg_stat_activity wait event breakdown shows Lock waits alongside IO and Client waits without manual SQL. Track pg_stat_database.deadlocks rate to flag when contention turns into deadlocks. Use checkpoint and replication context to distinguish lock contention from checkpoint I/O stalls that can look similar in application latency.
Related guides
- How PostgreSQL actually works in production: a mental model for operators
- PostgreSQL connection exhaustion: detection, diagnosis, and prevention
- PostgreSQL connection refused: pg_hba, listen_addresses, and TCP diagnosis
- PostgreSQL deadlock detected: how to diagnose and prevent deadlocks
- PostgreSQL idle in transaction: detecting and killing zombie sessions
- PostgreSQL monitoring checklist: the signals every production database needs
- PostgreSQL monitoring maturity model: from reactive to self-healing
- PgBouncer pool exhausted: how to diagnose and fix client waits
- PgBouncer vs Pgpool-II vs Odyssey: choosing a PostgreSQL connection pooler
- PostgreSQL FATAL: too many connections - causes and fixes






