MySQL Innodb_log_waits > 0: the log buffer is too small (not a checkpoint stall)

InnoDB increments Innodb_log_waits every time a thread waits because the log buffer is full. A sustained nonzero rate means the buffer cannot absorb your write burst, or the storage layer cannot drain it fast enough.

This is not a checkpoint stall. Checkpoint stalls happen when the on-disk redo log fills and InnoDB forces synchronous dirty-page flushing. Innodb_log_waits measures in-memory log buffer pressure only. Confusing the two leads to tuning redo log file size while the real problem persists.

What this means

InnoDB writes redo records to the log buffer before commit. The buffer is sized by innodb_log_buffer_size. When a thread cannot reserve space, it waits for the log writer to flush to disk. Each wait increments Innodb_log_waits.

In a healthy system, this counter stays at zero. A sustained positive rate means the buffer is undersized for the write rate, or storage fsync is too slow. A single large transaction can also spike the counter by generating more redo than the buffer holds in one shot.

Checkpoint age pressure is a different mechanism. Checkpoint age tracks how much on-disk redo log space has been written since the last checkpoint. When it nears capacity, InnoDB stalls writes to flush dirty pages. That stall is measured by checkpoint age, not Innodb_log_waits. Both can coexist, but Innodb_log_waits alone does not indicate a checkpoint stall.

flowchart TD
    A[Innodb_log_waits > 0] --> B{Checkpoint age high?}
    B -->|Yes| C[Checkpoint stall
address redo capacity
and flush rate] B -->|No| D{Pending fsyncs > 0?} D -->|Yes| E[Storage fsync bottleneck
check disk latency] D -->|No| F[Log buffer too small] F --> G[Check INNODB_TRX
for large transactions] G --> H[Increase buffer
or split commits]

Common causes

CauseWhat it looks likeFirst thing to check
innodb_log_buffer_size too smallSustained Innodb_log_waits during normal write load; rate scales with transaction throughputSHOW GLOBAL VARIABLES LIKE 'innodb_log_buffer_size';
Slow redo log fsyncInnodb_log_waits paired with Innodb_os_log_pending_fsyncs > 0 or rising disk write latencySHOW GLOBAL STATUS LIKE 'Innodb_os_log_pending_fsyncs'; plus OS disk metrics
Large transactionBrief spike correlating with a single transaction modifying many rowsSELECT * FROM information_schema.INNODB_TRX ORDER BY trx_rows_modified DESC;

Quick checks

Run these read-only checks to confirm the signal and separate it from a checkpoint stall.

-- Confirm the counter and current buffer size
SHOW GLOBAL STATUS LIKE 'Innodb_log_waits';
SHOW GLOBAL VARIABLES LIKE 'innodb_log_buffer_size';
-- Check redo write rate and pending I/O
SHOW GLOBAL STATUS LIKE 'Innodb_os_log_written';
SHOW GLOBAL STATUS LIKE 'Innodb_os_log_pending_fsyncs';
SHOW GLOBAL STATUS LIKE 'Innodb_os_log_pending_writes';
-- Check for large active transactions
SELECT trx_id, trx_mysql_thread_id, trx_started, trx_rows_modified, trx_query
FROM information_schema.INNODB_TRX
ORDER BY trx_rows_modified DESC LIMIT 5;
-- MySQL 8.0.30+: compute checkpoint age to rule out stall
SELECT
  CAST((SELECT VARIABLE_VALUE FROM performance_schema.global_status WHERE VARIABLE_NAME = 'Innodb_redo_log_current_lsn') AS UNSIGNED) -
  CAST((SELECT VARIABLE_VALUE FROM performance_schema.global_status WHERE VARIABLE_NAME = 'Innodb_redo_log_checkpoint_lsn') AS UNSIGNED)
  AS checkpoint_age;
-- Pre-8.0.30: read checkpoint age from INNODB_METRICS
-- Note: enable the metric first if it returns zero
SELECT `COUNT` FROM INFORMATION_SCHEMA.INNODB_METRICS WHERE NAME = 'log_lsn_checkpoint_age';
# OS-level write latency on the redo log device
iostat -x 1 5

How to diagnose it

  1. Confirm the signal is sustained. Take two samples of Innodb_log_waits one minute apart. A single increment during a burst may be harmless. A steady climb means active pressure.

  2. Rule out a checkpoint stall. Compute checkpoint age as a ratio of total redo log capacity. In MySQL 8.0.30+, capacity is innodb_redo_log_capacity. In earlier versions, it is innodb_log_file_size multiplied by innodb_log_files_in_group. If the ratio is below 75%, the redo log files are not the problem. If it is above 75%, you have a checkpoint stall in addition to, or instead of, a buffer issue.

  3. Check for storage fsync pressure. If Innodb_os_log_pending_fsyncs or Innodb_os_log_pending_writes is sustained above zero, the disk subsystem cannot keep up with the flush rate. This causes the buffer to back up even if it is nominally large enough.

  4. Look for a large transaction. Query information_schema.INNODB_TRX ordered by trx_rows_modified. A transaction with millions of modified rows can fill the log buffer regardless of the configured size.

  5. Baseline the write rate. Divide the delta of Innodb_os_log_written by the elapsed seconds between samples. If the rate is high and steady, the buffer is simply too small for your workload.

Metrics and signals to monitor

SignalWhy it mattersWarning sign
Innodb_log_waits rateDirect indicator of log buffer pressureAny sustained nonzero rate
Innodb_os_log_pending_fsyncsStorage cannot keep up with redo durability requestsSustained nonzero value
Innodb_os_log_pending_writesStorage cannot keep up with redo log write volumeSustained nonzero value
Checkpoint age / redo capacity ratioDistinguishes buffer waits from checkpoint stall; both may coexist> 75% of capacity
Innodb_os_log_written rateQuantifies the write load driving buffer turnoverSudden burst correlating with waits
Disk write latency on redo deviceReveals storage-level slowness independent of buffer sizeLatency spikes correlating with wait episodes

Fixes

Increase the log buffer size

If checkpoint age is healthy and storage latency is low, increase innodb_log_buffer_size. The default in MySQL 8.0 is 16 MB. Production write-heavy workloads often need more headroom. This variable is not dynamic; schedule a restart to apply the change.

Reduce redo log fsync latency

If Innodb_os_log_pending_fsyncs is sustained above zero, the bottleneck is storage throughput or latency, not buffer size. Investigate the disk subsystem hosting the redo log. In MySQL 8.0.30+, redo log files live in the #innodb_redo/ directory inside the data directory. In earlier versions, they are the ib_logfile* files in the data directory. If the underlying storage is shared, degraded, or misconfigured, moving to dedicated, low-latency storage may be necessary.

Warning: Reducing fsync frequency by changing innodb_flush_log_at_trx_commit sacrifices durability and risks data loss on crash. Do not use this as a permanent fix.

Split large transactions

If a single transaction in INNODB_TRX is responsible for the spike, break the work into smaller commits. Each commit flushes the buffer, preventing it from filling monotonically during one massive operation.

Prevention

  • Monitor Innodb_log_waits as a binary health signal. Zero should be the steady state.
  • Baseline Innodb_os_log_written during your peak write window before deploying new write-heavy features.
  • Track OS-level disk write latency on the volume hosting the redo log so storage degradation does not masquerade as a buffer sizing issue.
  • Review application batch jobs and ETL processes to ensure they commit in bounded-size chunks.

How Netdata helps

  • Netdata collects Innodb_log_waits and Innodb_os_log_pending_fsyncs from SHOW GLOBAL STATUS.
  • Disk latency charts on the redo log partition are correlated with the MySQL metrics, distinguishing a storage bottleneck from an undersized buffer.
  • Checkpoint age is tracked alongside log waits, preventing the common misdiagnosis of checkpoint stalls.
  • Per-second resolution catches brief spikes from large transactions that slower polling intervals might smooth over.