PostgreSQL autovacuum tuning: per-table thresholds for high-churn workloads

Default autovacuum settings target modest OLTP workloads. On a 500-million-row table, the global autovacuum_vacuum_scale_factor of 0.2 means autovacuum ignores the table until dead tuples exceed twenty percent of the row count. For high-churn tables, that delay lets bloat accumulate, degrades indexes, slows scans, and pushes transaction ID age toward wraparound. PostgreSQL overrides these thresholds per table via storage parameters (reloptions), so a 50 GB hot table can run aggressive settings without saturating workers across a 500 MB reference table. This guide covers calculating overrides, sizing worker memory, and verifying vacuum keeps up without drowning the cluster in background I/O.

Prerequisites

PostgreSQL instance with access to pg_stat_user_tables and ALTER TABLE ... SET (...).
Superuser or table owner on target tables.
Disk I/O headroom for more frequent vacuum work. If storage is already saturated, raise autovacuum_vacuum_cost_delay or increase the scale factor before you lower thresholds.
Table-level statistics source: pg_stat_user_tables or a metrics exporter.

Procedure

Identify candidate tables
Query pg_stat_user_tables for tables with high dead-tuple ratios or high absolute dead-tuple counts. Focus on large tables with frequent UPDATE or DELETE.
```
SELECT schemaname, relname,
       n_live_tup, n_dead_tup,
       ROUND(n_dead_tup::numeric / NULLIF(n_live_tup + n_dead_tup, 0), 3) AS dead_ratio
FROM pg_stat_user_tables
WHERE n_dead_tup > 10000
ORDER BY n_dead_tup DESC
LIMIT 10;
```
A sustained dead-tuple ratio above 0.10 on a busy table is a strong signal that the default 20 percent threshold is too conservative.
Choose a trigger strategy
Autovacuum fires when dead tuples exceed:
```
threshold + scale_factor * reltuples
```
With default settings, a 50-million-row table needs more than 10 million dead tuples before autovacuum triggers. Two common fixes:
- Fixed threshold: Set autovacuum_vacuum_scale_factor = 0 and set autovacuum_vacuum_threshold to an absolute number of dead tuples, for example 50000. This is predictable on very large tables.
- Reduced scale factor: Keep a small scale factor, such as 0.01, so the trigger scales with table growth but fires earlier than the default.
Apply per-table storage parameters
Use ALTER TABLE ... SET (...) to persist overrides in the catalog.
```
ALTER TABLE events SET (
    autovacuum_vacuum_scale_factor = 0.01,
    autovacuum_vacuum_threshold = 50,
    autovacuum_analyze_scale_factor = 0.005,
    autovacuum_analyze_threshold = 50,
    autovacuum_vacuum_cost_delay = 2
);
```
autovacuum_vacuum_cost_delay = 2 matches the modern global default. For tables on fast SSD where bloat matters more than I/O latency, some operators set this to 0 to disable throttling entirely. Do that only after confirming the storage subsystem can absorb the extra load. You can also raise autovacuum_vacuum_cost_limit per table to let that table’s worker consume more of the global cost budget.
Size worker memory
autovacuum_work_mem defaults to -1, so each worker falls back to maintenance_work_mem. If that fallback is small, a large table with many indexes forces vacuum to scan indexes in multiple passes.
```
ALTER SYSTEM SET autovacuum_work_mem = '1GB';
```
1 GB to 2 GB is a common starting point for high-churn tables with several indexes. Reload the configuration after changing it.
```
psql -c "SELECT pg_reload_conf();"
```
Tune insert-only and append-only tables
If the table rarely sees UPDATE or DELETE, dead tuples are not the problem, but freeze progress and visibility map maintenance are. From PostgreSQL 13 onward, use autovacuum_vacuum_insert_scale_factor to trigger vacuum based on insert volume rather than dead tuples.
```
ALTER TABLE logs SET (
    autovacuum_vacuum_insert_scale_factor = 0.01,
    autovacuum_vacuum_insert_threshold = 1000
);
```
For truly insert-only tables, also monitor age(relfrozenxid) closely because the default dead-tuple threshold may never be reached.
Prime the table after bulk loads
After loading or backfilling data, do not wait for autovacuum. Run a manual VACUUM ANALYZE to update the free space map and statistics immediately.
```
VACUUM ANALYZE events;
```
For very large post-load tables, VACUUM (DISABLE_PAGE_SKIPPING) ensures a thorough first pass.

flowchart TD
    A[Identify high-churn table] --> B{Prefer fixed or relative trigger?}
    B -->|Large table, predictable churn| C[Set scale_factor = 0
Set threshold = N tuples]
    B -->|Growing table, variable size| D[Set scale_factor = 0.01
Keep threshold low]
    C --> E[Set cost_delay per table]
    D --> E
    E --> F[Size autovacuum_work_mem]
    F --> G[Monitor dead_tup and relfrozenxid]

Verifying it works

After applying settings, monitor the table over the next hours and days:

Recency: Check that last_autovacuum in pg_stat_user_tables advances after each expected trigger window.
Dead tuple trend: n_dead_tup should stop growing monotonically and stay below your target.
Log output: If log_autovacuum_min_duration is set, look for completion lines that mention the table and show index cleanup finishing in a single pass. Multiple passes indicate insufficient autovacuum_work_mem.
Freeze progress: Query age(relfrozenxid) from pg_class.
```
SELECT relname, age(relfrozenxid)
FROM pg_class
WHERE relname = 'events'
  AND relkind = 'r';
```
The value should increase slowly or step back after vacuum. If it is not resetting, verify that autovacuum is not blocked by a long-running transaction or an abandoned replication slot.

Common pitfalls

Over-tuning every table. Lowering thresholds across the whole database can saturate the autovacuum_max_workers pool (default 3) and cause small tables to starve large ones. Tune only tables that prove they need it.

Ignoring the xmin horizon. If n_dead_tup does not decrease despite active autovacuum workers, something is holding back the transaction horizon: a long-running query, an abandoned replication slot, or an idle in transaction session. No threshold change fixes a blocked horizon. Check pg_stat_activity and pg_replication_slots before assuming the tuning failed.

Undersizing memory. A table with ten indexes and 1 GB autovacuum_work_mem may still need multiple index passes if the dead-tuple map does not fit. If autovacuum logs show repeated index scans, raise the memory or reduce bloat with pg_repack first.

Tuning vacuum while ignoring fillfactor. If the table is update-heavy and fillfactor is 100, PostgreSQL cannot perform HOT updates on the same page. Every update becomes a dead tuple plus a new live tuple, increasing vacuum load. Lowering fillfactor to 85 or 90 leaves room for HOT updates, reducing dead-tuple generation before it reaches vacuum.

Forgetting ANALYZE. Vacuum reclaims space; ANALYZE updates planner statistics. A table that is vacuumed aggressively but analyzed rarely can still suffer plan regressions. Set autovacuum_analyze_scale_factor low alongside the vacuum scale factor.

Signals to monitor

Signal	Why it matters	Warning sign
`pg_stat_user_tables.n_dead_tup / (n_live_tup + n_dead_tup)`	Measures bloat pressure per table	> 0.10 sustained on a high-churn table
`pg_stat_user_tables.last_autovacuum`	Confirms vacuum reaches the table	Older than `autovacuum_naptime` * 3 on hot tables
`pg_class.age(relfrozenxid)`	Tracks freeze progress	> 500 million XIDs
`pg_stat_activity` autovacuum workers	Reveals worker saturation	All workers pinned to one table for > 1 hour
`pg_stat_user_tables.n_tup_upd` vs `n_dead_tup`	Validates vacuum effectiveness	Dead tuples growing faster than update rate implies vacuum is not keeping up

How Netdata helps

Netdata exposes pg_stat_user_tables metrics, including per-table dead-tuple counts and last-vacuum age, without manual queries. Correlate spikes in pgsql.table_dead_tuples with query latency percentiles to confirm bloat is causing slow reads. Netdata also tracks pgsql.db_transaction_id_age per database, so you can alert well before the wraparound warning threshold. During tuning, use disk I/O charts to verify that more aggressive autovacuum is not saturating the storage subsystem.

How PostgreSQL actually works in production: a mental model for operators: /guides/postgres/how-postgres-works-in-production/
PostgreSQL ALTER TABLE blocked: zero-downtime DDL patterns: /guides/postgres/postgres-alter-table-blocked/
PostgreSQL autovacuum not running: detection, causes, and fixes: /guides/postgres/postgres-autovacuum-not-running/
PostgreSQL blocking queries: finding the root blocker in a lock cascade: /guides/postgres/postgres-blocking-queries/
PostgreSQL connection exhaustion: detection, diagnosis, and prevention: /guides/postgres/postgres-connection-exhaustion/
PostgreSQL connection refused: pg_hba, listen_addresses, and TCP diagnosis: /guides/postgres/postgres-connection-refused/
PostgreSQL deadlock detected: how to diagnose and prevent deadlocks: /guides/postgres/postgres-deadlock-detected/
PostgreSQL idle in transaction: detecting and killing zombie sessions: /guides/postgres/postgres-idle-in-transaction/
PostgreSQL ERROR: could not obtain lock — diagnosis and recovery: /guides/postgres/postgres-lock-not-available/
PostgreSQL monitoring checklist: the signals every production database needs: /guides/postgres/postgres-monitoring-checklist/
PostgreSQL monitoring maturity model: from reactive to self-healing: /guides/postgres/postgres-monitoring-maturity-model/
PgBouncer pool exhausted: how to diagnose and fix client waits: /guides/postgres/postgres-pgbouncer-pool-exhausted/

The Netdata solution

PostgreSQL monitoring with Netdata

Netdata monitors PostgreSQL with per-second metrics, pre-built dashboards, and ML-powered anomaly detection. Correlate connection saturation, lock waits, autovacuum progress, replication lag, and checkpoint I/O against the rest of your stack so you catch the incidents in these runbooks before they page anyone.

See PostgreSQL monitoring → Start monitoring free

PostgreSQL autovacuum tuning: per-table thresholds for high-churn workloads

PostgreSQL autovacuum tuning: per-table thresholds for high-churn workloads

Prerequisites

Procedure

Verifying it works

Common pitfalls

Signals to monitor

How Netdata helps

Related guides

PostgreSQL monitoring with Netdata