$ guides / postgres / postgres-checkpoint-storms ▌

Operations Guides

PostgreSQL checkpoint storms: detection, causes, and tuning

If query latency spikes and TPS drops coincide with your checkpoint_timeout schedule or follow a large bulk load, you are likely hitting a checkpoint storm. PostgreSQL checkpoints guarantee that dirty buffers are on disk. In a healthy system, timed checkpoints occur at checkpoint_timeout intervals and the background writer spreads that I/O. A storm happens when WAL generation hits max_wal_size before the timeout, forcing a requested checkpoint that flushes most dirty buffers at once, saturating disk bandwidth and stalling queries.

flowchart TD
    A[Heavy write load or small max_wal_size] --> B[WAL fills before timeout]
    B --> C[Forced checkpoint requested]
    C --> D[Dirty buffer flush at full speed]
    D --> E[Disk I/O saturation]
    E --> F[Query latency spikes and TPS drops]

Common causes

Cause	What it looks like	First thing to check
Undersized `max_wal_size`	`checkpoints_req` rising steadily alongside `checkpoints_timed`; logs note WAL-driven checkpoints	Ratio of `checkpoints_req` to `checkpoints_timed` in `pg_stat_bgwriter`
Short `checkpoint_timeout` with heavy writes	Predictable spikes on a fixed schedule; `buffers_checkpoint` dominates `buffers_clean`	`pg_stat_bgwriter` checkpoint timing and count
Bulk load exceeding WAL limits	Sudden spikes during `COPY`, `CREATE INDEX`, or `REINDEX`; cascading forced checkpoints	`pg_stat_activity` for large write operations and current WAL LSN growth
Slow storage or excessive dirty pages	High disk queue depth during flushes; backends doing their own fsync	OS I/O metrics and `buffers_backend_fsync` in `pg_stat_bgwriter`

Quick checks

Run these read-only queries on a live primary. Compare counters against your last statistics reset to judge trends.

# 1. Checkpoint counts. Rising checkpoints_req means WAL is driving checkpoints.
psql -c "SELECT checkpoints_timed, checkpoints_req FROM pg_stat_bgwriter;"

# 2. I/O volume and timing. Large buffers_checkpoint with high write/sync time signals a heavy flush.
psql -c "SELECT buffers_checkpoint, buffers_clean, buffers_backend, checkpoint_write_time, checkpoint_sync_time FROM pg_stat_bgwriter;"

# 3. Current settings.
psql -c "SELECT current_setting('checkpoint_timeout') AS timeout, current_setting('max_wal_size') AS max_wal, current_setting('checkpoint_completion_target') AS target;"

# 4. Estimate WAL generation rate over 60 seconds.
START=$(psql -Atc "SELECT pg_current_wal_lsn()") && sleep 60 && psql -c "SELECT pg_wal_lsn_diff(pg_current_wal_lsn(), '$START') AS wal_bytes"

# 5. OS disk pressure during spikes.
iostat -xz 1 5

# 6. Backends forced to write their own dirty pages.
psql -c "SELECT buffers_backend_fsync FROM pg_stat_bgwriter;"

How to diagnose it

Correlate latency spikes with checkpoint events. Compare application P99 latency with checkpoint counts from pg_stat_bgwriter or PostgreSQL logs. If the spikes align with checkpoint intervals, the storm is confirmed.
Determine if checkpoints are forced. Query pg_stat_bgwriter. If checkpoints_req is more than 10% of checkpoints_timed, max_wal_size is likely too small for your WAL generation rate.
Measure checkpoint duration. Sample checkpoint_write_time and checkpoint_sync_time over an interval, then divide by the number of new checkpoints in that period. Averages over 30 seconds indicate the I/O system cannot absorb the flush.
Check dirty page distribution. If buffers_checkpoint is consistently greater than 70% of total buffers written (buffers_checkpoint + buffers_clean + buffers_backend), the background writer is not keeping up.
Validate storage throughput. Run iostat during a spike. If disk latency jumps above 20 ms and queue depth rises, storage is the bottleneck.
Identify the write source. Query pg_stat_activity for active COPY, CREATE INDEX, REINDEX, or large INSERT ... SELECT statements. If the spike has passed, check PostgreSQL logs for recent completion of large write operations.

Metrics and signals to monitor

Signal	Why it matters	Warning sign
`checkpoints_req / checkpoints_timed`	Forced checkpoints mean WAL is filling faster than the timeout interval	Requested checkpoints exceed 10% of timed
`checkpoint_write_time + checkpoint_sync_time`	Total duration of disk flush and fsync	Per-checkpoint average over 30 seconds
`buffers_checkpoint / total_writes`	Fraction of dirty pages written by checkpoints rather than bgwriter or backends	Consistently above 70%
`buffers_backend_fsync`	Backends forced to fsync because bgwriter or checkpoint could not keep up	Non-zero or rising count
WAL generation rate vs `max_wal_size`	Whether WAL fills before the timeout	LSN advances faster than `max_wal_size` can absorb within `checkpoint_timeout`

Fixes

Increase max_wal_size

If checkpoints_req is high, raise max_wal_size to absorb write bursts. Start with 4 GB for many production workloads and scale upward for heavy write loads. Monitor disk usage; WAL segments are recycled, but the ceiling must fit your peak burst.

Tradeoff: more disk space consumed by WAL segments, but far fewer forced checkpoints.

Requires a configuration reload.

Lengthen checkpoint_timeout

If spikes are frequent because the interval is too short, increase checkpoint_timeout from the default 5 minutes to 15-30 minutes. This spreads the same dirty-page workload over fewer events.

Tradeoff: more dirty pages accumulate between checkpoints, but each checkpoint is less frequent and easier to smooth.

Requires a configuration reload.

Raise checkpoint_completion_target

Set checkpoint_completion_target to 0.9 so the background writer spreads checkpoint I/O over 90% of the interval. The default on many versions is 0.5. With 0.9, PostgreSQL starts writing dirty pages early and aims to finish 90% of the way through the interval, turning a spike into a sustained, low-rate background task.

Tradeoff: checkpoint writes start earlier and run continuously, slightly increasing baseline I/O but eliminating spikes.

Requires a configuration reload.

Break bulk operations into batches

A single COPY or CREATE INDEX can generate enough WAL to force an immediate checkpoint. Commit between batches so WAL is flushed incrementally rather than forcing a single massive checkpoint.

Tradeoff: slightly more complex application logic, but it avoids cascading forced checkpoints.

Improve storage throughput

If checkpoint duration remains high after tuning parameters, the storage layer is undersized. For new clusters, place WAL on a dedicated volume with low-latency disks to reduce contention between sequential WAL writes and random data page access. For existing clusters, migrating pg_wal requires a maintenance window.

Tradeoff: infrastructure cost.

Prevention

Monitor the checkpoint ratio. Alert when checkpoints_req exceeds 10% of checkpoints_timed.
Size max_wal_size for peaks, not averages. A bulk load or ETL job should not force a checkpoint.
Keep checkpoint_completion_target at 0.9 unless you have verified that your storage handles aggressive writeback.
Schedule bulk loads during low-traffic windows.
Separate WAL and data volumes at initialization. On existing clusters, migration requires downtime.

How Netdata helps

Correlate PostgreSQL checkpoint metrics from pg_stat_bgwriter with per-second disk I/O latency and utilization. If checkpoint counts rise at the same time as disk await spikes, storage saturation is confirmed.
Alert when the ratio of requested to timed checkpoints exceeds your threshold, catching undersized max_wal_size before it causes outages.
Visualize WAL generation rate alongside replication lag to distinguish primary write pressure from replica apply delays.
Track buffer cache efficiency and backend fsync counts to spot background writer stress before it becomes a storm.

How PostgreSQL actually works in production: a mental model for operators: /guides/postgres/how-postgres-works-in-production/
PostgreSQL ALTER TABLE blocked: zero-downtime DDL patterns: /guides/postgres/postgres-alter-table-blocked/
PostgreSQL autovacuum blocked by long-running transaction: detection and fix: /guides/postgres/postgres-autovacuum-blocked-by-long-transaction/
PostgreSQL autovacuum not running: detection, causes, and fixes: /guides/postgres/postgres-autovacuum-not-running/
PostgreSQL autovacuum tuning: per-table thresholds for high-churn workloads: /guides/postgres/postgres-autovacuum-tuning/
PostgreSQL blocking queries: finding the root blocker in a lock cascade: /guides/postgres/postgres-blocking-queries/
PostgreSQL connection exhaustion: detection, diagnosis, and prevention: /guides/postgres/postgres-connection-exhaustion/
PostgreSQL connection refused: pg_hba, listen_addresses, and TCP diagnosis: /guides/postgres/postgres-connection-refused/
PostgreSQL dead tuples piling up: why autovacuum can’t keep up: /guides/postgres/postgres-dead-tuples-piling-up/
PostgreSQL deadlock detected: how to diagnose and prevent deadlocks: /guides/postgres/postgres-deadlock-detected/
PostgreSQL frozen XID monitoring: catching wraparound 6 months early: /guides/postgres/postgres-frozen-xid-monitoring/
PostgreSQL idle in transaction: detecting and killing zombie sessions: /guides/postgres/postgres-idle-in-transaction/

The Netdata solution

PostgreSQL monitoring with Netdata

Netdata monitors PostgreSQL with per-second metrics, pre-built dashboards, and ML-powered anomaly detection. Correlate connection saturation, lock waits, autovacuum progress, replication lag, and checkpoint I/O against the rest of your stack so you catch the incidents in these runbooks before they page anyone.

See PostgreSQL monitoring → Start monitoring free

PostgreSQL checkpoint storms: detection, causes, and tuning

PostgreSQL checkpoint storms: detection, causes, and tuning

Common causes

Quick checks

How to diagnose it

Metrics and signals to monitor

Fixes

Increase max_wal_size

Lengthen checkpoint_timeout

Raise checkpoint_completion_target

Break bulk operations into batches

Improve storage throughput

Prevention

How Netdata helps

Related guides

PostgreSQL monitoring with Netdata