PostgreSQL dead tuples piling up: why autovacuum can’t keep up
Table sizes grow faster than insert rates, and queries that scanned thousands of rows last week now scan millions. pg_stat_user_tables shows n_dead_tup climbing into the millions while last_autovacuum is stale or absent. Autovacuum is running somewhere in the cluster, but it is losing the race.
PostgreSQL creates dead tuples on every UPDATE and DELETE. Autovacuum reclaims them only when the dead tuple count crosses a threshold. On large or high-churn tables, the default threshold is far too conservative, and even when vacuum does fire, long-running transactions, worker starvation, or streaming replica feedback can prevent tuple removal. The result is bloat: wasted space, slower scans, and eventually transaction ID wraparound risk.
What this means
Dead tuples are row versions no longer visible to any transaction. They remain until VACUUM marks the space reusable. Autovacuum compares n_dead_tup against a per-table threshold: threshold + scale_factor * reltuples. The global defaults are autovacuum_vacuum_threshold = 50 and autovacuum_vacuum_scale_factor = 0.2. On a 100-million-row table, vacuum does not trigger until roughly 20 million tuples are dead.
Even when autovacuum starts, it can fail to reduce n_dead_tup. A long-running transaction holds an MVCC snapshot that prevents VACUUM from removing tuples newer than the snapshot. On a streaming replica, hot_standby_feedback = on tells the primary to retain dead tuples the replica might still need. With only autovacuum_max_workers = 3 by default, many tables hitting their thresholds simultaneously can queue high-churn tables behind cold ones.
On PostgreSQL 16 and earlier, autovacuum tracks dead tuple TIDs in a memory array capped by autovacuum_work_mem. When that array fills, the worker rescans the heap for additional index vacuum phases. PostgreSQL 17 replaces this with a TIDStore backed by an Adaptive Radix Tree, cutting memory use and collapsing multiple index phases into one.
flowchart TD
A[n_dead_tup growing] --> B[Check last_autovacuum
per table]
B -->|Stale or absent| C[Check scale_factor
vs table size]
B -->|Recent but n_dead_tup
still growing| D[Check blockers]
D --> E[Long-running xacts?]
D --> F[hot_standby_feedback?]
D --> G[Workers saturated?]
E --> H[Terminate or fix app]
F --> I[Disable feedback or
tune replica queries]
G --> J[Raise workers or
tune per-table]
C --> K[Lower scale_factor
or threshold]Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| Scale factor too high for large tables | n_dead_tup stable in the millions; last_autovacuum is days old | Table-level autovacuum_vacuum_scale_factor |
| Long-running transaction blocking vacuum | n_dead_tup grows despite active autovacuum workers; vacuum completes but count does not drop | pg_stat_activity for xact_start older than vacuum cycles |
| Worker starvation | Many tables exceed thresholds; small tables get vacuumed while hot tables wait | SHOW autovacuum_max_workers; worker occupancy |
hot_standby_feedback on replica | Primary bloats with modest write rates; replica runs long reports | pg_replication_slots and hot_standby_feedback setting |
Insufficient work_mem for dead tuple tracking (pre-PG17) | Workers active but n_dead_tup keeps climbing; repeated index vacuum phases | autovacuum_work_mem or maintenance_work_mem |
Quick checks
# Top tables by dead tuple count
psql -c "SELECT schemaname, relname, n_dead_tup, n_live_tup,
round(n_dead_tup::numeric / NULLIF(n_live_tup + n_dead_tup, 0), 2) AS dead_ratio,
last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC LIMIT 10;"
# Sessions that may be blocking vacuum
psql -c "SELECT pid, usename, state, xact_start, state_change, query
FROM pg_stat_activity
WHERE xact_start < NOW() - INTERVAL '5 minutes'
OR (state = 'idle in transaction' AND state_change < NOW() - INTERVAL '5 minutes')
ORDER BY xact_start;"
# Currently running autovacuum workers
psql -c "SELECT pid, query, backend_start, state
FROM pg_stat_activity
WHERE query LIKE 'autovacuum:%';"
# Wraparound age per database
psql -c "SELECT datname, age(datfrozenxid) FROM pg_database ORDER BY 2 DESC;"
# Replication slot status on the primary
psql -c "SELECT slot_name, active, restart_lsn FROM pg_replication_slots;"
# Current autovacuum configuration
psql -c "SHOW autovacuum_max_workers;"
psql -c "SHOW autovacuum_vacuum_scale_factor;"
psql -c "SHOW autovacuum_work_mem;"
How to diagnose it
Quantify bloat. Run the dead tuple ratio query. Any table with a ratio above 0.20 and growing is actively bloating. Compare
n_dead_tupton_tup_updandn_tup_delto confirm write rate.Check if vacuum reaches the table. Look at
last_autovacuuminpg_stat_user_tables. If it is recent butn_dead_tupdoes not drop, vacuum fires but cannot reclaim tuples. If it is stale, the table may not have hit its threshold or workers are starved.Find blockers. Query
pg_stat_activityfor sessions with oldxact_start, especiallystate = 'idle in transaction'. These pin snapshots that block dead tuple removal. Checkpg_replication_slotson the primary for inactive slots, and verify whetherhot_standby_feedbackis enabled on replicas running long queries.Check worker saturation. If
pg_stat_activityshows autovacuum workers constantly busy andlast_autovacuumis stale on hot tables, the defaultautovacuum_max_workers = 3may be too low.Check memory pressure on PostgreSQL 16 and earlier. If a table goes through multiple index vacuum phases in
pg_stat_progress_vacuum,autovacuum_work_memmay be too small to hold all dead tuple TIDs without rescanning the heap.Correlate primary bloat with replicas. If the primary bloats but write volume is moderate, check replica query patterns.
hot_standby_feedback = ontrades primary space for replica query stability.
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
Dead tuple ratio (n_dead_tup / (n_live_tup + n_dead_tup)) | Direct measure of bloat pressure | > 0.20 sustained |
last_autovacuum per table | Confirms autovacuum is reaching the table | Stale (> 24h) on write-heavy tables |
age(datfrozenxid) | Freeze progress; vacuum failure leads to wraparound | > 500 million |
pg_stat_activity idle in transaction age | Blockers preventing dead tuple removal | Any session > 5 minutes |
Active autovacuum workers vs autovacuum_max_workers | Worker starvation | All workers busy continuously |
Fixes
Tune thresholds for high-churn tables
Do not rely on the global 20% scale factor for large or hot tables. Set per-table overrides:
ALTER TABLE events SET (
autovacuum_vacuum_scale_factor = 0.01,
autovacuum_vacuum_threshold = 100
);
This triggers vacuum after roughly 1% of the table changes instead of 20%. For insert-only tables, use autovacuum_vacuum_insert_scale_factor (PostgreSQL 13+).
Tradeoff: More frequent vacuum consumes more I/O and CPU. Raise vacuum_cost_delay to throttle if the cluster becomes I/O saturated.
Remove blockers
WARNING: Terminating backends aborts in-flight work and can disrupt applications. Coordinate with application teams before running this in production.
Terminate idle-in-transaction sessions that are pinning snapshots:
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE state = 'idle in transaction'
AND state_change < NOW() - INTERVAL '5 minutes';
Set a preventive limit:
ALTER SYSTEM SET idle_in_transaction_session_timeout = '2min';
SELECT pg_reload_conf();
If hot_standby_feedback is causing primary bloat, evaluate whether the replica’s long queries can be moved to a different replica, batched into smaller windows, or run with feedback disabled. Disabling feedback reduces bloat at the cost of more query cancels on the replica.
Tradeoff: Killing transactions rolls back application work. Coordinate with application teams before terminating.
Increase worker and memory headroom
Raise the global worker pool when many tables compete:
ALTER SYSTEM SET autovacuum_max_workers = 6;
Restart PostgreSQL to apply autovacuum_max_workers.
For PostgreSQL 16 and earlier, raise autovacuum_work_mem so a single worker can track more dead tuples without rescanning:
ALTER SYSTEM SET autovacuum_work_mem = '1GB';
SELECT pg_reload_conf();
On PostgreSQL 17, the TIDStore uses less memory for the same workload, but aggressive tables still benefit from higher limits.
Tradeoff: More workers and memory increase resource competition with foreground queries. Monitor CPU and I/O saturation after changes.
Reclaim existing bloat online
If bloat is already severe, regular VACUUM will not shrink the table. Use pg_repack for online reorganization:
pg_repack -d mydb -t events
This requires a primary key or unique NOT NULL index and roughly twice the table size in free disk space. It does not hold an exclusive lock for the duration.
Avoid VACUUM FULL in production. It acquires an ACCESS EXCLUSIVE lock and blocks all access for the entire rewrite.
Prevention
- Tune per-table autovacuum settings on all high-churn and large tables before bloat appears.
- Partition large time-series tables. Parallel vacuum runs per partition and keeps table sizes manageable.
- Set
fillfactorbelow 100 on update-heavy tables to enable HOT updates, which reduce index churn and dead tuple creation. - Monitor dead tuple ratio and wraparound age proactively. Alert at 10% dead ratio and 500 million XID age.
- Set
idle_in_transaction_session_timeoutglobally to prevent abandoned transactions from blocking vacuum. - Test backup restores and keep
pg_repackavailable. Bloat recovery is an operational event.
How Netdata helps
- Charts
n_dead_tupper table alongside query latency and sequential scan rates to show bloat impact. - Alerts on dead tuple ratio and
age(datfrozenxid)before wraparound becomes critical. - Tracks active autovacuum workers against
autovacuum_max_workersto detect saturation. - Shows
idle in transactionage and replication slot lag with dead tuple growth to identify blockers. - Correlates primary table bloat with replica lag and
hot_standby_feedbackstatus to diagnose cross-instance pressure.
Related guides
- How PostgreSQL actually works in production: a mental model for operators: /guides/postgres/how-postgres-works-in-production/
- PostgreSQL ALTER TABLE blocked: zero-downtime DDL patterns: /guides/postgres/postgres-alter-table-blocked/
- PostgreSQL autovacuum not running: detection, causes, and fixes: /guides/postgres/postgres-autovacuum-not-running/
- PostgreSQL blocking queries: finding the root blocker in a lock cascade: /guides/postgres/postgres-blocking-queries/
- PostgreSQL connection exhaustion: detection, diagnosis, and prevention: /guides/postgres/postgres-connection-exhaustion/
- PostgreSQL connection refused: pg_hba, listen_addresses, and TCP diagnosis: /guides/postgres/postgres-connection-refused/
- PostgreSQL deadlock detected: how to diagnose and prevent deadlocks: /guides/postgres/postgres-deadlock-detected/
- PostgreSQL idle in transaction: detecting and killing zombie sessions: /guides/postgres/postgres-idle-in-transaction/
- PostgreSQL ERROR: could not obtain lock — diagnosis and recovery: /guides/postgres/postgres-lock-not-available/
- PostgreSQL monitoring checklist: the signals every production database needs: /guides/postgres/postgres-monitoring-checklist/
- PostgreSQL monitoring maturity model: from reactive to self-healing: /guides/postgres/postgres-monitoring-maturity-model/
- PgBouncer pool exhausted: how to diagnose and fix client waits: /guides/postgres/postgres-pgbouncer-pool-exhausted/






