PostgreSQL index bloat: detection and REINDEX CONCURRENTLY recovery
Queries that used to return in milliseconds are now spiking to hundreds. Disk usage grows faster than insert volume. EXPLAIN shows a bitmap index scan pulling thousands of heap pages for a selective filter. The cause is usually index bloat: dead pages and fragmentation in B-tree indexes. Unlike table bloat, index bloat is not visible in pg_stat_user_tables, and it often degrades query latency or fills a volume before you notice it. Detect bloat with pgstattuple and recover online with REINDEX CONCURRENTLY.
What this means
PostgreSQL B-tree indexes store entries in 8 KB pages. When rows are updated or deleted, the index entries become dead, but the pages are not immediately reclaimed. VACUUM marks index tuples as dead, yet page-level fragmentation remains because new entries do not perfectly backfill the gaps. The result is low average leaf density. Many pages contain a mix of live and dead tuples, or are entirely dead but still allocated. A bloated index consumes extra disk space, pollutes the buffer cache, and forces bitmap scans to visit more heap pages than necessary. Over time, the index can grow larger than its table.
flowchart TD
A["Index size > table size or latency spike"] --> B{"Run pgstatindex"}
B -->|"avg_leaf_density < 70%"| C["Significant bloat confirmed"]
B -->|"avg_leaf_density >= 70%"| D["Check table stats and HOT ratio"]
C --> E{"PG version and index type"}
E -->|"PG12+ B-tree"| F["REINDEX INDEX CONCURRENTLY"]
E -->|"Exclusion or PG11"| G["Maintenance window or pg_repack"]
F --> H{"Success?"}
H -->|"No, _ccnew left"| I["Drop invalid _ccnew and retry"]
H -->|"Yes, _ccold left"| J["Drop invalid _ccold remnant"]
D --> K["Tune autovacuum or fillfactor"]Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| High UPDATE/DELETE rate on indexed columns | Index size grows faster than table size; index scan latency rises | pg_stat_user_tables.n_tup_upd and n_tup_del vs insert rate |
| Long-running transactions blocking vacuum | Dead tuples persist despite active autovacuum workers | Oldest xact_start or backend_xid in pg_stat_activity |
| Autovacuum settings too conservative for table churn | Dead tuple ratio > 20% on high-write tables; last_autovacuum is stale | Per-table autovacuum_vacuum_scale_factor and last_autovacuum timestamp |
| Low HOT update ratio | Every update touches indexes even when non-indexed heap columns change | n_tup_hot_upd / n_tup_upd in pg_stat_user_tables |
| Excessive indexes on a high-write table | Write amplification and vacuum overhead spread across many indexes | pg_stat_user_indexes.idx_scan vs idx_tup_read for unused indexes |
Quick checks
These commands are read-only and safe to run during an incident.
# Find indexes that are larger than their table
psql -c "SELECT schemaname, relname, indexrelname,
pg_size_pretty(pg_relation_size(indexrelid)) AS idx_size,
pg_size_pretty(pg_relation_size(relid)) AS tbl_size
FROM pg_stat_user_indexes
WHERE pg_relation_size(indexrelid) > pg_relation_size(relid)
ORDER BY pg_relation_size(indexrelid) DESC
LIMIT 10;"
# Identify long-running transactions that may block vacuum
psql -c "SELECT pid, usename, state,
EXTRACT(EPOCH FROM (now() - xact_start)) AS xact_age_sec
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
AND EXTRACT(EPOCH FROM (now() - xact_start)) > 300
ORDER BY xact_start;"
# Review dead tuple counts and last autovacuum per table
psql -c "SELECT schemaname, relname, n_dead_tup, n_live_tup, last_autovacuum
FROM pg_stat_user_tables
WHERE n_dead_tup > 10000
ORDER BY n_dead_tup DESC
LIMIT 10;"
# Measure B-tree index bloat with pgstatindex (requires pgstattuple extension)
psql -c "SELECT avg_leaf_density, leaf_fragmentation, deleted_pages
FROM pgstatindex('public.my_index');"
# List invalid indexes, including leftovers from failed concurrent rebuilds
psql -c "SELECT n.nspname AS schema, c.relname AS index_name, i.indisvalid AS valid
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE NOT i.indisvalid;"
How to diagnose it
- Spot size anomalies. Query
pg_stat_user_indexesfor indexes larger than their parent table. Joinpg_classto comparereltuplesagainst index size if you need a row-count estimate. - Confirm bloat with
pgstatindex. Install thepgstattupleextension if it is not already present. RunSELECT avg_leaf_density, leaf_fragmentation, deleted_pages FROM pgstatindex('idx_name');. Bloat percentage is roughly100 - avg_leaf_density. Values below 70% combined with elevateddeleted_pagesindicate significant bloat. Access topgstatindexis restricted to members of thepg_stat_scan_tablesrole; superusers bypass this. - Correlate with table activity. Check
pg_stat_user_tablesfor highn_tup_updandn_tup_delon the underlying table. A lown_tup_hot_updratio means updates are not heap-only, so they generate index churn. - Check for blockers. Verify that no long-running transactions or
idle in transactionsessions are preventing vacuum from reclaiming dead tuples and index pages. - Inspect for invalid remnants. Query
pg_indexforindisvalid = false. A failedREINDEX CONCURRENTLYmay have left an invalid_ccnewindex that consumes space without providing any benefit.
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
pg_relation_size(indexrelid) / pg_relation_size(relid) | Bloated indexes can exceed table size on disk | Ratio > 1.0 for non-unique indexes |
avg_leaf_density from pgstatindex | Measures packing efficiency of leaf pages | Below 70% indicates significant bloat |
leaf_fragmentation from pgstatindex | Measures physical ordering degradation | Above 30% suggests recovery will help |
pg_stat_user_tables.n_dead_tup | Dead tuples drive index bloat | Growing steadily on indexed tables |
pg_stat_activity.xact_start age | Long transactions block vacuum and freeze | Any backend older than 5 minutes |
pg_stat_user_indexes.idx_scan | Unused indexes still incur write and vacuum overhead | idx_scan = 0 over a full business cycle |
Fixes
Rebuild online with REINDEX CONCURRENTLY (PostgreSQL 12+)
REINDEX INDEX CONCURRENTLY idx_name;
This is the preferred fix for B-tree index bloat on PostgreSQL 12 and later. It takes a SHARE UPDATE EXCLUSIVE lock instead of ACCESS EXCLUSIVE, allowing concurrent inserts, updates, and deletes. The command performs two table scans and a final catalog swap. It cannot be run inside a transaction block. Only one concurrent index build per table is permitted at a time.
To rebuild all indexes on a table:
REINDEX TABLE CONCURRENTLY table_name;
Note that this skips exclusion-constraint indexes. You must handle those separately.
Exclusion constraints. REINDEX CONCURRENTLY raises an error if you target an exclusion-constraint index directly, and skips it when reindexing a table or database. Non-concurrent REINDEX is required for these indexes, so plan a maintenance window.
System catalogs. REINDEX SYSTEM does not support CONCURRENTLY.
Recover from a failed REINDEX CONCURRENTLY
If the rebuild fails mid-operation (for example, due to a uniqueness violation found during the second scan), PostgreSQL leaves behind an invalid index. A transient new index is suffixed _ccnew; if the original index could not be dropped after a successful rebuild, it is suffixed _ccold. Numeric suffixes such as _ccnew1 may appear to keep names unique.
Warning: DROP INDEX is destructive. Verify the index name and confirm it is invalid before dropping.
For an invalid _ccnew index, drop it and retry:
DROP INDEX idx_name_ccnew;
REINDEX INDEX CONCURRENTLY idx_name;
For an invalid _ccold index, the rebuild succeeded and the old index is no longer needed. Drop the remnant:
DROP INDEX idx_name_ccold;
When REINDEX CONCURRENTLY is unavailable
On PostgreSQL 11 and earlier, REINDEX CONCURRENTLY is not available. Options include:
CREATE INDEX CONCURRENTLYwith a swap. Build a new index with a temporary name concurrently, drop the old index, and rename the new one. This requires updating any foreign-key or constraint references that depend on the old index name. Caution: This does not work for indexes backingPRIMARY KEYorUNIQUEconstraints without dropping and recreating the constraint.pg_repack. Thepg_repackextension can reorganize a table and its indexes online without an extended exclusive lock. It requires the table to have a primary key or a unique NOT NULL index, plus roughly twice the disk space of the target table and its indexes. The final swap acquires a briefACCESS EXCLUSIVElock that is typically measured in milliseconds, not proportional to table size.
Non-B-tree and specialized indexes
pgstatindex works only for B-tree indexes. For GIN, GiST, BRIN, and HASH indexes, monitor physical size growth over time as the primary signal.
GIN indexes deserve special attention. When fastupdate is enabled (the default), inserts go to a pending list that is later flushed by autovacuum or ANALYZE. A large pending list slows searches and consumes space. You can force a flush with gin_clean_pending_list() or consider fastupdate = off if insert latency is acceptable and search performance is critical.
Prevention
- Aggressive per-table autovacuum. Lower
autovacuum_vacuum_scale_factorto 0.01–0.05 on high-churn tables so dead tuples are reclaimed before they fragment index pages. - HOT-friendly fillfactor. Set
fillfactorto 85 or 90 on update-heavy tables to leave room for heap-only tuple updates, which avoid index maintenance when indexed columns do not change. - Remove unused indexes. Every index slows writes and adds vacuum overhead. Drop indexes with zero scans over a full business cycle.
- Short transaction timeouts. Set
idle_in_transaction_session_timeoutto 60–300 seconds to prevent abandoned transactions from pinning dead tuples and blocking vacuum. - GIN pending list limits. Monitor GIN pending list growth. If searches degrade, use
gin_clean_pending_list()or disablefastupdatefor the index.
How Netdata helps
- Index size metrics from
pg_stat_user_indexeshighlight disproportionate growth against the parent table. - Dead tuple ratio per table from
pg_stat_user_tableswarns that bloat pressure is building before scans degrade. pg_stat_statementslatency percentiles flag plan regressions from bloated indexes, such as bitmap scans that fetch excessive heap pages.- Connection state charts surface
idle in transactionsessions that block autovacuum. - Custom alerts on index-to-table size ratios or dead tuple percentages can trigger before you need emergency REINDEX.
Related guides
- How PostgreSQL actually works in production: a mental model for operators
- PostgreSQL ALTER TABLE blocked: zero-downtime DDL patterns
- PostgreSQL blocking queries: finding the root blocker in a lock cascade
- PostgreSQL connection exhaustion: detection, diagnosis, and prevention
- PostgreSQL connection refused: pg_hba, listen_addresses, and TCP diagnosis
- PostgreSQL deadlock detected: how to diagnose and prevent deadlocks
- PostgreSQL idle in transaction: detecting and killing zombie sessions
- PostgreSQL ERROR: could not obtain lock – diagnosis and recovery
- PostgreSQL monitoring checklist: the signals every production database needs
- PostgreSQL monitoring maturity model: from reactive to self-healing
- PgBouncer pool exhausted: how to diagnose and fix client waits
- PgBouncer vs Pgpool-II vs Odyssey: choosing a PostgreSQL connection pooler






