$ guides / postgres / postgres-sequential-scan-on-large-table ▌

Operations Guides

PostgreSQL sequential scan on a large table: when to add an index and when not to

Seeing Seq Scan on a large table in EXPLAIN does not mean the planner is wrong. Adding a B-tree index often makes the query slower, burdens every write path with index maintenance, and wastes storage. PostgreSQL’s planner compares the estimated cost of reading the table sequentially against the estimated cost of traversing an index and fetching rows from the heap. There is no hardcoded row-percentage threshold. On SSD-backed servers, the default cost parameters are frequently stale, so the planner may be wrong in either direction.

What the planner is actually comparing

For every candidate scan path, the planner computes an abstract cost using statistics from ANALYZE and GUC parameters.

Sequential scan cost:

seq_page_cost * table pages + cpu_tuple_cost * rows scanned

Index scan cost:

random_page_cost * index pages visited + random_page_cost * heap pages fetched + cpu_tuple_cost * rows retrieved

The planner picks the cheaper path. The widely cited five-to-ten-percent rule of thumb is emergent behavior from the default seq_page_cost (1.0) and random_page_cost (4.0) on typical hardware, not a hardcoded threshold. The 4.0 default was calibrated for spinning disks. On NVMe or SSD-backed instances, random I/O latency is close enough to sequential I/O that this assumption is outdated. When random_page_cost is left at 4.0 on all-flash storage, the planner overestimates the cost of index scans and falls back to sequential scans for queries that would benefit from an index.

EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM events WHERE created_at > '2025-01-01';

How the cost model decides

flowchart TD
    A[Query received] --> B[Estimate rows returned]
    B --> C{Usable index exists?}
    C -->|No| D[Plan: Seq Scan]
    C -->|Yes| E[Estimate index scan cost]
    E --> F[random_page_cost * index pages + heap fetches]
    B --> G[Estimate seq scan cost]
    G --> H[seq_page_cost * table pages + cpu_tuple_cost * rows]
    F --> I{Index cost < Seq cost?}
    H --> I
    I -->|Yes| J[Plan: Index Scan]
    I -->|No| D

If no index can satisfy the predicate, the planner reads the table sequentially. If an index exists, the planner estimates how many rows match the filter and how scattered the matching rows are across heap pages. For a low-selectivity predicate on a large table, an index scan might touch many heap pages at random. Under the default random_page_cost of 4.0, that random access looks expensive, so the planner prefers a sequential scan. Lowering random_page_cost toward 1.1 or 2.0 on SSD-only storage can flip the plan to an index scan.

Between the two extremes, the planner may choose a bitmap index scan. PostgreSQL scans the index to build a bitmap of matching heap pages, then reads those pages in physical order. This converts random heap access into a more sequential pattern, often beating both a pure index scan and a full sequential scan for moderate selectivity. Parallel bitmap heap scans go further: the leader builds the bitmap while workers read the heap in parallel. The underlying index scan itself is not parallelized.

Parallel sequential scans add another variable. Since PostgreSQL 9.6, the planner can assign multiple workers to scan non-overlapping ranges of the same table. This reduces the effective per-query cost of a sequential scan. The planner only considers parallelism for tables larger than min_parallel_table_scan_size (default 8 MB). On large tables, a parallel sequential scan can outrun an index scan even for moderately selective queries, particularly when the index is not covering and would require many heap lookups.

PostgreSQL 17 introduces a streaming I/O interface that batches multiple read requests during sequential scans, bitmap heap scans, and vacuum. This lowers per-call overhead and makes sequential scans faster than in prior versions. The planner does not account for this at the SQL level, but the observed effect is that sequential scans are more competitive relative to index scans. If you upgrade to PostgreSQL 17 and see more Seq Scan nodes, the planner may not be broken; the underlying sequential read path simply got faster.

Where this shows up in production

SSDs with a stale random_page_cost. A cloud instance backed by NVMe may still have random_page_cost set to 4.0. The planner sees random heap fetches as disproportionately expensive and chooses sequential scans for queries that return as little as three to eight percent of the table. Lowering random_page_cost to 1.1 or 2.0 for SSD-only storage often corrects the bias. On mixed storage tiers, setting it too low causes the planner to choose index scans that perform poorly when data spills to slower disks.

LIMIT clauses with bad estimates. The planner discounts the estimated cost of an index scan when a LIMIT clause is present, assuming it can stop after fetching a fraction of matching rows. If the actual cardinality is much higher than the estimate, the index scan degrades badly. This is not a case where the sequential scan was better; it is a case where the index was chosen based on bad statistics. The correct fix is usually a higher statistics target or a partial index, not index removal.

Broad reporting queries. A quarterly summary that filters a 500 GB event table and returns fifteen to twenty percent of rows is often faster with a sequential scan, especially if PostgreSQL parallelizes it. Forcing an index scan here increases random I/O and CPU overhead without reducing the row count.

Small or memory-resident tables. When a table fits entirely in shared_buffers or the OS page cache, a sequential scan is essentially a memory scan. The planner knows this through effective_cache_size and may correctly prefer a Seq Scan even when an index exists, because the cost difference between random and sequential access collapses in RAM.

Mixed storage tiers. If your database resides on a volume tier that mixes SSD cache with spinning disk, setting random_page_cost universally to 1.1 tells the planner that random I/O is cheap everywhere. When a query touches data aged out to spinning disk, an index scan becomes far more expensive than estimated. Keep random_page_cost conservative in these environments, or use tablespaces to isolate hot data.

When to add an index, and when to leave the seq scan alone

Add an index when the query pattern is stable and selective:

The predicate returns a small fraction of the table. Under default cost parameters, the emergent threshold is roughly under five to ten percent, but verify with EXPLAIN rather than trusting the rule.
You can build a covering index using INCLUDE that eliminates heap fetches entirely and enables an index-only scan.
You can build a partial index that matches a selective, recurring predicate. Partial indexes are smaller to maintain and more likely to stay in cache.
The query is high-frequency OLTP and latency-sensitive.

Do not add an index blindly:

If the query returns more than roughly ten to fifteen percent of rows, sequential I/O is usually more efficient than random heap lookups.
If the table is write-heavy and the query is rare. Every additional index slows INSERT, UPDATE, and DELETE because PostgreSQL must maintain the index tuple and write WAL for it.
If the table is small or fits almost entirely in RAM. A sequential scan from memory is fast, and the planner may be correct to prefer it.
If the query cannot match a partial index you already have. Adding another broad index is over-indexing, which consumes space and write bandwidth.

A common misuse is setting enable_seqscan to off. This does not disable sequential scans; it assigns them an artificially high cost so the planner prefers any alternative. It is a diagnostic flag for verifying whether an index path exists, not a tuning knob. Using it in production hides the real problem, which is usually stale statistics or a random_page_cost that does not match your storage.

Another misuse is indexing low-cardinality columns like a status field with three values. If the filter matches thirty percent of rows, the index will not beat a sequential scan, yet you still pay the write and vacuum overhead.

Signals to watch in production

Signal	Why it matters	Warning sign
`pg_stat_user_tables.seq_scan` vs `idx_scan`	Sustained sequential scans on large tables can mean missing indexes, or legitimate cost decisions	Selective queries show rising `seq_scan` and near-zero `idx_scan`
`EXPLAIN` estimated vs. actual rows	Underestimation causes the planner to prefer a seq scan when an index is better; overestimation causes the opposite	Estimated rows differ from actual by more than 10x
`random_page_cost` setting	Default of 4.0 misleads the planner on SSD/NVMe storage, biasing it toward seq scans	Value is 4.0 on all-SSD instances with no spinning disk tier
`pg_stat_user_tables.idx_scan` for a new index	Confirms the planner is actually using the index	`idx_scan` remains 0 over multiple days after creation
`pg_stat_statements.stddev_time` / `mean_time`	Bimodal latency from plan flapping between seq scan and index scan	Ratio exceeds 1.0 for the same query text

# Check whether an index is being used
SELECT schemaname, relname, seq_scan, idx_scan
FROM pg_stat_user_tables
WHERE relname = 'events';

How Netdata helps

Cross-reference these signals in Netdata to decide whether a sequential scan is a problem or a correct plan:

Correlate pg_stat_user_tables.seq_scan counts with disk I/O metrics. Rising sequential scans alongside saturated disk read throughput indicate the scan is hitting physical storage. Flat disk I/O means the scan is likely memory-resident and harmless.
Track query latency percentiles. A correct sequential scan on a cached table does not produce latency spikes; a sequential scan that evicts shared_buffers does.
Monitor buffer cache hit ratios. A drop in blks_hit / (blks_hit + blks_read) alongside rising seq_scan counts suggests scans are pushing the working set out of cache.
Check table-level scan trends over time to verify whether a newly created index reduced sequential scans or simply added write overhead.
Cross-reference autovacuum and dead tuple metrics with query statistics to distinguish a missing-index problem from bloat that makes sequential scans slower than they should be.

The Netdata solution

PostgreSQL monitoring with Netdata

Netdata monitors PostgreSQL with per-second metrics, pre-built dashboards, and ML-powered anomaly detection. Correlate connection saturation, lock waits, autovacuum progress, replication lag, and checkpoint I/O against the rest of your stack so you catch the incidents in these runbooks before they page anyone.

See PostgreSQL monitoring → Start monitoring free

PostgreSQL sequential scan on a large table: when to add an index and when not to

PostgreSQL sequential scan on a large table: when to add an index and when not to

What the planner is actually comparing

How the cost model decides

Where this shows up in production

When to add an index, and when to leave the seq scan alone

Signals to watch in production

How Netdata helps

Related guides

PostgreSQL monitoring with Netdata