PostgreSQL autovacuum tuning: per-table thresholds for high-churn workloads
Default autovacuum settings target modest OLTP workloads. On a 500-million-row table, the global autovacuum_vacuum_scale_factor of 0.2 means autovacuum ignores the table until dead tuples exceed twenty percent of the row count. For high-churn tables, that delay lets bloat accumulate, degrades indexes, slows scans, and pushes transaction ID age toward wraparound. PostgreSQL overrides these thresholds per table via storage parameters (reloptions), so a 50 GB hot table can run aggressive settings without saturating workers across a 500 MB reference table. This guide covers calculating overrides, sizing worker memory, and verifying vacuum keeps up without drowning the cluster in background I/O.
Prerequisites
- PostgreSQL instance with access to
pg_stat_user_tablesandALTER TABLE ... SET (...). - Superuser or table owner on target tables.
- Disk I/O headroom for more frequent vacuum work. If storage is already saturated, raise
autovacuum_vacuum_cost_delayor increase the scale factor before you lower thresholds. - Table-level statistics source:
pg_stat_user_tablesor a metrics exporter.
Procedure
Identify candidate tables
Query
pg_stat_user_tablesfor tables with high dead-tuple ratios or high absolute dead-tuple counts. Focus on large tables with frequent UPDATE or DELETE.SELECT schemaname, relname, n_live_tup, n_dead_tup, ROUND(n_dead_tup::numeric / NULLIF(n_live_tup + n_dead_tup, 0), 3) AS dead_ratio FROM pg_stat_user_tables WHERE n_dead_tup > 10000 ORDER BY n_dead_tup DESC LIMIT 10;A sustained dead-tuple ratio above 0.10 on a busy table is a strong signal that the default 20 percent threshold is too conservative.
Choose a trigger strategy
Autovacuum fires when dead tuples exceed:
threshold + scale_factor * reltuplesWith default settings, a 50-million-row table needs more than 10 million dead tuples before autovacuum triggers. Two common fixes:
- Fixed threshold: Set
autovacuum_vacuum_scale_factor = 0and setautovacuum_vacuum_thresholdto an absolute number of dead tuples, for example 50000. This is predictable on very large tables. - Reduced scale factor: Keep a small scale factor, such as 0.01, so the trigger scales with table growth but fires earlier than the default.
- Fixed threshold: Set
Apply per-table storage parameters
Use
ALTER TABLE ... SET (...)to persist overrides in the catalog.ALTER TABLE events SET ( autovacuum_vacuum_scale_factor = 0.01, autovacuum_vacuum_threshold = 50, autovacuum_analyze_scale_factor = 0.005, autovacuum_analyze_threshold = 50, autovacuum_vacuum_cost_delay = 2 );autovacuum_vacuum_cost_delay = 2matches the modern global default. For tables on fast SSD where bloat matters more than I/O latency, some operators set this to 0 to disable throttling entirely. Do that only after confirming the storage subsystem can absorb the extra load. You can also raiseautovacuum_vacuum_cost_limitper table to let that table’s worker consume more of the global cost budget.Size worker memory
autovacuum_work_memdefaults to -1, so each worker falls back tomaintenance_work_mem. If that fallback is small, a large table with many indexes forces vacuum to scan indexes in multiple passes.ALTER SYSTEM SET autovacuum_work_mem = '1GB';1 GB to 2 GB is a common starting point for high-churn tables with several indexes. Reload the configuration after changing it.
psql -c "SELECT pg_reload_conf();"Tune insert-only and append-only tables
If the table rarely sees UPDATE or DELETE, dead tuples are not the problem, but freeze progress and visibility map maintenance are. From PostgreSQL 13 onward, use
autovacuum_vacuum_insert_scale_factorto trigger vacuum based on insert volume rather than dead tuples.ALTER TABLE logs SET ( autovacuum_vacuum_insert_scale_factor = 0.01, autovacuum_vacuum_insert_threshold = 1000 );For truly insert-only tables, also monitor
age(relfrozenxid)closely because the default dead-tuple threshold may never be reached.Prime the table after bulk loads
After loading or backfilling data, do not wait for autovacuum. Run a manual
VACUUM ANALYZEto update the free space map and statistics immediately.VACUUM ANALYZE events;For very large post-load tables,
VACUUM (DISABLE_PAGE_SKIPPING)ensures a thorough first pass.
flowchart TD
A[Identify high-churn table] --> B{Prefer fixed or relative trigger?}
B -->|Large table, predictable churn| C[Set scale_factor = 0
Set threshold = N tuples]
B -->|Growing table, variable size| D[Set scale_factor = 0.01
Keep threshold low]
C --> E[Set cost_delay per table]
D --> E
E --> F[Size autovacuum_work_mem]
F --> G[Monitor dead_tup and relfrozenxid]Verifying it works
After applying settings, monitor the table over the next hours and days:
Recency: Check that
last_autovacuuminpg_stat_user_tablesadvances after each expected trigger window.Dead tuple trend:
n_dead_tupshould stop growing monotonically and stay below your target.Log output: If
log_autovacuum_min_durationis set, look for completion lines that mention the table and show index cleanup finishing in a single pass. Multiple passes indicate insufficientautovacuum_work_mem.Freeze progress: Query
age(relfrozenxid)frompg_class.SELECT relname, age(relfrozenxid) FROM pg_class WHERE relname = 'events' AND relkind = 'r';The value should increase slowly or step back after vacuum. If it is not resetting, verify that autovacuum is not blocked by a long-running transaction or an abandoned replication slot.
Common pitfalls
Over-tuning every table. Lowering thresholds across the whole database can saturate the autovacuum_max_workers pool (default 3) and cause small tables to starve large ones. Tune only tables that prove they need it.
Ignoring the xmin horizon. If n_dead_tup does not decrease despite active autovacuum workers, something is holding back the transaction horizon: a long-running query, an abandoned replication slot, or an idle in transaction session. No threshold change fixes a blocked horizon. Check pg_stat_activity and pg_replication_slots before assuming the tuning failed.
Undersizing memory. A table with ten indexes and 1 GB autovacuum_work_mem may still need multiple index passes if the dead-tuple map does not fit. If autovacuum logs show repeated index scans, raise the memory or reduce bloat with pg_repack first.
Tuning vacuum while ignoring fillfactor. If the table is update-heavy and fillfactor is 100, PostgreSQL cannot perform HOT updates on the same page. Every update becomes a dead tuple plus a new live tuple, increasing vacuum load. Lowering fillfactor to 85 or 90 leaves room for HOT updates, reducing dead-tuple generation before it reaches vacuum.
Forgetting ANALYZE. Vacuum reclaims space; ANALYZE updates planner statistics. A table that is vacuumed aggressively but analyzed rarely can still suffer plan regressions. Set autovacuum_analyze_scale_factor low alongside the vacuum scale factor.
Signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
pg_stat_user_tables.n_dead_tup / (n_live_tup + n_dead_tup) | Measures bloat pressure per table | > 0.10 sustained on a high-churn table |
pg_stat_user_tables.last_autovacuum | Confirms vacuum reaches the table | Older than autovacuum_naptime * 3 on hot tables |
pg_class.age(relfrozenxid) | Tracks freeze progress | > 500 million XIDs |
pg_stat_activity autovacuum workers | Reveals worker saturation | All workers pinned to one table for > 1 hour |
pg_stat_user_tables.n_tup_upd vs n_dead_tup | Validates vacuum effectiveness | Dead tuples growing faster than update rate implies vacuum is not keeping up |
How Netdata helps
Netdata exposes pg_stat_user_tables metrics, including per-table dead-tuple counts and last-vacuum age, without manual queries. Correlate spikes in pgsql.table_dead_tuples with query latency percentiles to confirm bloat is causing slow reads. Netdata also tracks pgsql.db_transaction_id_age per database, so you can alert well before the wraparound warning threshold. During tuning, use disk I/O charts to verify that more aggressive autovacuum is not saturating the storage subsystem.
Related guides
- How PostgreSQL actually works in production: a mental model for operators: /guides/postgres/how-postgres-works-in-production/
- PostgreSQL ALTER TABLE blocked: zero-downtime DDL patterns: /guides/postgres/postgres-alter-table-blocked/
- PostgreSQL autovacuum not running: detection, causes, and fixes: /guides/postgres/postgres-autovacuum-not-running/
- PostgreSQL blocking queries: finding the root blocker in a lock cascade: /guides/postgres/postgres-blocking-queries/
- PostgreSQL connection exhaustion: detection, diagnosis, and prevention: /guides/postgres/postgres-connection-exhaustion/
- PostgreSQL connection refused: pg_hba, listen_addresses, and TCP diagnosis: /guides/postgres/postgres-connection-refused/
- PostgreSQL deadlock detected: how to diagnose and prevent deadlocks: /guides/postgres/postgres-deadlock-detected/
- PostgreSQL idle in transaction: detecting and killing zombie sessions: /guides/postgres/postgres-idle-in-transaction/
- PostgreSQL ERROR: could not obtain lock — diagnosis and recovery: /guides/postgres/postgres-lock-not-available/
- PostgreSQL monitoring checklist: the signals every production database needs: /guides/postgres/postgres-monitoring-checklist/
- PostgreSQL monitoring maturity model: from reactive to self-healing: /guides/postgres/postgres-monitoring-maturity-model/
- PgBouncer pool exhausted: how to diagnose and fix client waits: /guides/postgres/postgres-pgbouncer-pool-exhausted/






