PostgreSQL temp file explosion: work_mem tuning and disk pressure

Disk usage on your PostgreSQL primary is climbing by gigabytes per hour. Query latency has doubled, iowait is spiking, and you have not deployed new code. The database is writing large temporary files to disk. In-memory operations such as sorts, hash joins, and materialization are spilling because they exceed the per-operation memory budget. The root cause is almost always a mismatch between work_mem and actual query memory demand, compounded by the fact that PostgreSQL applies this limit per plan node, not per query or session.

What this means

PostgreSQL executes queries using a tree of plan nodes. When a node needs to sort, hash, or materialize rows, it allocates memory up to work_mem. If the input exceeds that limit, the executor spills overflow to temporary files on disk. Because work_mem is per operation, a single query with four hash joins and two sorts can consume six times work_mem in that backend alone. With parallel query active, each worker gets its own allocation, multiplying usage further. Many backends spilling simultaneously can saturate disk I/O, trigger checkpoint pressure, and even cause the Linux OOM killer to terminate the postmaster if work_mem is set too aggressively.

flowchart TD
    A[Sort or hash operation] --> B{Memory exceeds work_mem?}
    B -->|Yes| C[Spill to temp file]
    B -->|No| D[In-memory execution]
    C --> E[Disk I/O increases]
    E --> F[Query latency spikes]
    C --> G[Multiple ops per query]
    G --> H[Memory usage multiplies]
    H --> I[OOM killer risk]

Common causes

CauseWhat it looks likeFirst thing to check
work_mem too low for query complexitypg_stat_database.temp_bytes growing rapidly; long-running queries with sorts or hashestemp_bytes and temp_files deltas in pg_stat_database
work_mem set too high causing OOMPostmaster killed by OOM killer in dmesg; cluster restarts; temp files appear after recoverySystem OOM logs and pg_stat_activity backend count
Parallel query multiplying memory usageSpikes correlate with high CPU; plans show Gather nodesmax_parallel_workers_per_gather and active worker count
Missing indexes forcing large sortsSequential scans on large tables paired with Sort nodespg_stat_user_tables.seq_scan vs. idx_scan
Long-running analytical queries on the primarySudden temp growth during batch jobs or reportspg_stat_activity for active queries with long query_start

Quick checks

# Check cumulative temp file volume per database
psql -c "SELECT datname, temp_files, pg_size_pretty(temp_bytes) FROM pg_stat_database WHERE temp_bytes > 0 ORDER BY temp_bytes DESC;"
# Find long-running active queries that may be spilling
psql -c "SELECT pid, usename, state, query_start, LEFT(query, 100) AS query FROM pg_stat_activity WHERE state = 'active' AND query_start < NOW() - INTERVAL '2 minutes';"
# Inspect memory and parallelism settings
psql -c "SELECT name, setting, unit FROM pg_settings WHERE name IN ('work_mem', 'max_connections', 'max_parallel_workers_per_gather');"
# Identify large tables using sequential scans
psql -c "SELECT schemaname, relname, seq_scan, idx_scan, n_live_tup FROM pg_stat_user_tables WHERE n_live_tup > 100000 ORDER BY seq_scan DESC LIMIT 10;"
# Check data volume disk usage
df -h "$(psql -Atc 'SHOW data_directory;')"
# Check for OOM kills (may require root)
dmesg | grep -i "out of memory"

How to diagnose it

  1. Confirm temp file volume is abnormal. Take two snapshots of pg_stat_database five minutes apart and compare temp_bytes and temp_files. A sustained increase confirms active spilling. These counters reset on PostgreSQL crash or if you run pg_stat_reset().
  2. Identify contributing queries. If pg_stat_statements is enabled, sort by total_exec_time and look for queries containing sorts, hashes, or window functions. Correlate execution times with the temp growth window.
  3. Inspect query plans. Capture an auto_explain log for a representative query, or run EXPLAIN (ANALYZE, BUFFERS) inside a transaction that you roll back. Warning: EXPLAIN ANALYZE executes the query. Do not use it on DML (INSERT, UPDATE, DELETE) unless you understand the side effects. Even inside a rolled-back transaction, DML takes locks, fires triggers, and modifies rows. Look for Sort Method: external merge Disk: ... or Hash node Batches: ... greater than one. High temp read and temp written values in the BUFFERS output also confirm spill activity.
  4. Check work_mem and parallelism. Verify the global work_mem value and whether the query uses parallel workers. Each worker allocates the same per-operation budget independently.
  5. Look for missing indexes. If pg_stat_user_tables shows high seq_scan counts on large tables involved in the slow queries, the planner may be sorting rows that an index could return pre-ordered.
  6. Review OS memory pressure. Calculate approximate worst-case memory usage as work_mem multiplied by typical plan nodes per query, active backends, and parallel workers (max_parallel_workers_per_gather + 1 including the leader). If this approaches available RAM, OOM is the likely next failure.

Metrics and signals to monitor

SignalWhy it mattersWarning sign
pg_stat_database.temp_bytesCumulative temp file write volumeGrowing faster than baseline over 15 minutes
pg_stat_database.temp_filesCount of temp files createdSustained increase indicating repeated spills
Disk utilization (data volume)Temp files share disk with data and WAL> 80% and climbing during peak load
pg_stat_activity long active queriesSpills occur during query executionActive queries older than 60 seconds with sort or hash patterns
seq_scan / idx_scan ratioMissing indexes force materialization and sortHigh seq_scan on tables with > 100k rows
OOM kills (dmesg)Excessive work_mem causes memory overcommitPostmaster or backends killed by OOM killer

Fixes

Right-size work_mem

Do not set work_mem based on RAM / max_connections. That calculation ignores the per-operation multiplier. A conservative global default is 4-16 MB. For known analytical roles, raise work_mem per role with ALTER ROLE ... SET work_mem = '64MB' rather than globally. Tradeoff: lower values increase disk spill and query latency; higher values risk OOM.

Add indexes to eliminate sorts

If plans show sequential scans followed by sorts on large tables, a covering index or partial index may let the planner return rows in the desired order without a separate sort node. Tradeoff: indexes slow writes and consume storage.

Reduce parallel worker multiplication

If the query plan shows a Gather node and temp spikes correlate with CPU saturation, lower max_parallel_workers_per_gather globally or per-query with SET max_parallel_workers_per_gather = 0. Tradeoff: queries run longer but use less aggregate memory per backend.

Isolate temp I/O

If temp file volume is large and sustained, route temp files to a dedicated tablespace using the temp_tablespaces setting. Place the tablespace on fast ephemeral storage to prevent temp I/O from competing with WAL and data file I/O on the same volume. Tradeoff: additional storage to manage; this does not reduce total temp volume.

Recover from OOM

If the Linux OOM killer terminated the postmaster while temp files were active, work_mem is likely too high for the concurrency level. Reduce work_mem in postgresql.conf, restart PostgreSQL, then reconnect applications. Verify with the memory formula in Prevention.

Prevention

  • Log temp file creation. Set log_temp_files to a threshold matching your work_mem budget so you catch spills early without flooding logs.
  • Weekly query review. Review pg_stat_statements for new queries with high total_exec_time and sort or hash patterns.
  • Memory budget math. Size work_mem so that work_mem multiplied by typical plan nodes per query, active backends, and parallel workers stays safely below available RAM.
  • Connection pooling. Use a pooler to limit active backends, which indirectly caps total memory usage.
  • Query timeouts. Set statement_timeout and idle_in_transaction_session_timeout to prevent runaway queries from holding temp files indefinitely.

How Netdata helps

Netdata correlates pg_stat_database.temp_bytes and temp_files with disk I/O wait and utilization on the same node. It alerts on disk utilization above 80% to catch pressure before queries fail. It tracks pg_stat_activity active connection states and query start times to identify long-running operations that coincide with temp spikes. It monitors system memory usage and Linux OOM kill events alongside PostgreSQL backend counts, and surfaces pg_stat_user_tables sequential scan rates to flag missing index candidates that drive sorts.