PostgreSQL checkpoint storms: detection, causes, and tuning

If query latency spikes and TPS drops coincide with your checkpoint_timeout schedule or follow a large bulk load, you are likely hitting a checkpoint storm. PostgreSQL checkpoints guarantee that dirty buffers are on disk. In a healthy system, timed checkpoints occur at checkpoint_timeout intervals and the background writer spreads that I/O. A storm happens when WAL generation hits max_wal_size before the timeout, forcing a requested checkpoint that flushes most dirty buffers at once, saturating disk bandwidth and stalling queries.

flowchart TD
    A[Heavy write load or small max_wal_size] --> B[WAL fills before timeout]
    B --> C[Forced checkpoint requested]
    C --> D[Dirty buffer flush at full speed]
    D --> E[Disk I/O saturation]
    E --> F[Query latency spikes and TPS drops]

Common causes

CauseWhat it looks likeFirst thing to check
Undersized max_wal_sizecheckpoints_req rising steadily alongside checkpoints_timed; logs note WAL-driven checkpointsRatio of checkpoints_req to checkpoints_timed in pg_stat_bgwriter
Short checkpoint_timeout with heavy writesPredictable spikes on a fixed schedule; buffers_checkpoint dominates buffers_cleanpg_stat_bgwriter checkpoint timing and count
Bulk load exceeding WAL limitsSudden spikes during COPY, CREATE INDEX, or REINDEX; cascading forced checkpointspg_stat_activity for large write operations and current WAL LSN growth
Slow storage or excessive dirty pagesHigh disk queue depth during flushes; backends doing their own fsyncOS I/O metrics and buffers_backend_fsync in pg_stat_bgwriter

Quick checks

Run these read-only queries on a live primary. Compare counters against your last statistics reset to judge trends.

# 1. Checkpoint counts. Rising checkpoints_req means WAL is driving checkpoints.
psql -c "SELECT checkpoints_timed, checkpoints_req FROM pg_stat_bgwriter;"
# 2. I/O volume and timing. Large buffers_checkpoint with high write/sync time signals a heavy flush.
psql -c "SELECT buffers_checkpoint, buffers_clean, buffers_backend, checkpoint_write_time, checkpoint_sync_time FROM pg_stat_bgwriter;"
# 3. Current settings.
psql -c "SELECT current_setting('checkpoint_timeout') AS timeout, current_setting('max_wal_size') AS max_wal, current_setting('checkpoint_completion_target') AS target;"
# 4. Estimate WAL generation rate over 60 seconds.
START=$(psql -Atc "SELECT pg_current_wal_lsn()") && sleep 60 && psql -c "SELECT pg_wal_lsn_diff(pg_current_wal_lsn(), '$START') AS wal_bytes"
# 5. OS disk pressure during spikes.
iostat -xz 1 5
# 6. Backends forced to write their own dirty pages.
psql -c "SELECT buffers_backend_fsync FROM pg_stat_bgwriter;"

How to diagnose it

  1. Correlate latency spikes with checkpoint events. Compare application P99 latency with checkpoint counts from pg_stat_bgwriter or PostgreSQL logs. If the spikes align with checkpoint intervals, the storm is confirmed.
  2. Determine if checkpoints are forced. Query pg_stat_bgwriter. If checkpoints_req is more than 10% of checkpoints_timed, max_wal_size is likely too small for your WAL generation rate.
  3. Measure checkpoint duration. Sample checkpoint_write_time and checkpoint_sync_time over an interval, then divide by the number of new checkpoints in that period. Averages over 30 seconds indicate the I/O system cannot absorb the flush.
  4. Check dirty page distribution. If buffers_checkpoint is consistently greater than 70% of total buffers written (buffers_checkpoint + buffers_clean + buffers_backend), the background writer is not keeping up.
  5. Validate storage throughput. Run iostat during a spike. If disk latency jumps above 20 ms and queue depth rises, storage is the bottleneck.
  6. Identify the write source. Query pg_stat_activity for active COPY, CREATE INDEX, REINDEX, or large INSERT ... SELECT statements. If the spike has passed, check PostgreSQL logs for recent completion of large write operations.

Metrics and signals to monitor

SignalWhy it mattersWarning sign
checkpoints_req / checkpoints_timedForced checkpoints mean WAL is filling faster than the timeout intervalRequested checkpoints exceed 10% of timed
checkpoint_write_time + checkpoint_sync_timeTotal duration of disk flush and fsyncPer-checkpoint average over 30 seconds
buffers_checkpoint / total_writesFraction of dirty pages written by checkpoints rather than bgwriter or backendsConsistently above 70%
buffers_backend_fsyncBackends forced to fsync because bgwriter or checkpoint could not keep upNon-zero or rising count
WAL generation rate vs max_wal_sizeWhether WAL fills before the timeoutLSN advances faster than max_wal_size can absorb within checkpoint_timeout

Fixes

Increase max_wal_size

If checkpoints_req is high, raise max_wal_size to absorb write bursts. Start with 4 GB for many production workloads and scale upward for heavy write loads. Monitor disk usage; WAL segments are recycled, but the ceiling must fit your peak burst.

Tradeoff: more disk space consumed by WAL segments, but far fewer forced checkpoints.

Requires a configuration reload.

Lengthen checkpoint_timeout

If spikes are frequent because the interval is too short, increase checkpoint_timeout from the default 5 minutes to 15-30 minutes. This spreads the same dirty-page workload over fewer events.

Tradeoff: more dirty pages accumulate between checkpoints, but each checkpoint is less frequent and easier to smooth.

Requires a configuration reload.

Raise checkpoint_completion_target

Set checkpoint_completion_target to 0.9 so the background writer spreads checkpoint I/O over 90% of the interval. The default on many versions is 0.5. With 0.9, PostgreSQL starts writing dirty pages early and aims to finish 90% of the way through the interval, turning a spike into a sustained, low-rate background task.

Tradeoff: checkpoint writes start earlier and run continuously, slightly increasing baseline I/O but eliminating spikes.

Requires a configuration reload.

Break bulk operations into batches

A single COPY or CREATE INDEX can generate enough WAL to force an immediate checkpoint. Commit between batches so WAL is flushed incrementally rather than forcing a single massive checkpoint.

Tradeoff: slightly more complex application logic, but it avoids cascading forced checkpoints.

Improve storage throughput

If checkpoint duration remains high after tuning parameters, the storage layer is undersized. For new clusters, place WAL on a dedicated volume with low-latency disks to reduce contention between sequential WAL writes and random data page access. For existing clusters, migrating pg_wal requires a maintenance window.

Tradeoff: infrastructure cost.

Prevention

  • Monitor the checkpoint ratio. Alert when checkpoints_req exceeds 10% of checkpoints_timed.
  • Size max_wal_size for peaks, not averages. A bulk load or ETL job should not force a checkpoint.
  • Keep checkpoint_completion_target at 0.9 unless you have verified that your storage handles aggressive writeback.
  • Schedule bulk loads during low-traffic windows.
  • Separate WAL and data volumes at initialization. On existing clusters, migration requires downtime.

How Netdata helps

  • Correlate PostgreSQL checkpoint metrics from pg_stat_bgwriter with per-second disk I/O latency and utilization. If checkpoint counts rise at the same time as disk await spikes, storage saturation is confirmed.
  • Alert when the ratio of requested to timed checkpoints exceeds your threshold, catching undersized max_wal_size before it causes outages.
  • Visualize WAL generation rate alongside replication lag to distinguish primary write pressure from replica apply delays.
  • Track buffer cache efficiency and backend fsync counts to spot background writer stress before it becomes a storm.
  • How PostgreSQL actually works in production: a mental model for operators: /guides/postgres/how-postgres-works-in-production/
  • PostgreSQL ALTER TABLE blocked: zero-downtime DDL patterns: /guides/postgres/postgres-alter-table-blocked/
  • PostgreSQL autovacuum blocked by long-running transaction: detection and fix: /guides/postgres/postgres-autovacuum-blocked-by-long-transaction/
  • PostgreSQL autovacuum not running: detection, causes, and fixes: /guides/postgres/postgres-autovacuum-not-running/
  • PostgreSQL autovacuum tuning: per-table thresholds for high-churn workloads: /guides/postgres/postgres-autovacuum-tuning/
  • PostgreSQL blocking queries: finding the root blocker in a lock cascade: /guides/postgres/postgres-blocking-queries/
  • PostgreSQL connection exhaustion: detection, diagnosis, and prevention: /guides/postgres/postgres-connection-exhaustion/
  • PostgreSQL connection refused: pg_hba, listen_addresses, and TCP diagnosis: /guides/postgres/postgres-connection-refused/
  • PostgreSQL dead tuples piling up: why autovacuum can’t keep up: /guides/postgres/postgres-dead-tuples-piling-up/
  • PostgreSQL deadlock detected: how to diagnose and prevent deadlocks: /guides/postgres/postgres-deadlock-detected/
  • PostgreSQL frozen XID monitoring: catching wraparound 6 months early: /guides/postgres/postgres-frozen-xid-monitoring/
  • PostgreSQL idle in transaction: detecting and killing zombie sessions: /guides/postgres/postgres-idle-in-transaction/