MySQL Innodb_buffer_pool_wait_free > 0: buffer pool memory pressure

Innodb_buffer_pool_wait_free increments when InnoDB must synchronously flush dirty pages to make room for new reads. A sustained nonzero rate means queries are waiting on disk writes before they can proceed. Unlike the buffer pool hit ratio, which can stay above 99% while the system stalls, wait_free confirms the buffer pool is operating at its limit.

This is the Buffer Pool Cliff pattern: once the working set exceeds available clean pages, performance degrades non-linearly, disk I/O saturates, and threads pile up.

What this means

Innodb_buffer_pool_wait_free counts how often InnoDB needed a free page, found none, and waited while dirty pages were flushed. Under normal operation, background page cleaning keeps queries unblocked. When the counter rises, either the background flush cannot keep up, or the pool is too small for the working set.

The counter is cumulative. Measure the rate of change over a 60-second window. Brief spikes during bursty writes are tolerable; a sustained nonzero rate is not normal and adds latency to affected queries.

The typical cascade: dirty pages accumulate, free pages drop to zero, reads trigger synchronous flushes, query latency increases, threads pile up, and disk I/O saturates.

flowchart TD
  A[Dirty pages accumulate] --> B[Free pages exhausted]
  B --> C[Innodb_buffer_pool_wait_free increments]
  C --> D[Synchronous flush on read]
  D --> E[Query latency spikes]
  E --> F[Threads_running piles up]
  F --> G[Disk I/O saturates]
  G --> H[Throughput collapse]

Common causes

CauseWhat it looks likeFirst thing to check
Buffer pool too small for working setHit ratio degrading over days; wait_free increments during peak trafficInnodb_buffer_pool_reads rate relative to Innodb_buffer_pool_read_requests
Write rate exceeds flush capacityDirty page ratio climbing; checkpoint age trending toward redo log capacityInnodb_buffer_pool_pages_dirty ratio and checkpoint age in SHOW ENGINE INNODB STATUS
Slow I/O subsystemInnodb_data_pending_writes or pending fsyncs elevated; OS disk %util highiostat -x on the data and redo log volumes
Large table scans evicting hot pagesSudden hit ratio drop after a deploy or batch job; specific queries on large tablesSHOW PROCESSLIST for active queries against large tables
Cold buffer pool after restartHit ratio low immediately after restart; improves over 30-60 minutesUptime and whether innodb_buffer_pool_load_at_startup is enabled

Quick checks

Run these read-only checks to confirm the scope of the pressure.

# Check current wait_free and compare again in 60 seconds to get the rate
mysql -e "SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_wait_free';"
# Buffer pool hit ratio over the current uptime
mysql -e "SELECT (1 - (a.VARIABLE_VALUE / b.VARIABLE_VALUE)) * 100 AS hit_pct FROM performance_schema.global_status a JOIN performance_schema.global_status b WHERE a.VARIABLE_NAME = 'Innodb_buffer_pool_reads' AND b.VARIABLE_NAME = 'Innodb_buffer_pool_read_requests';"
# Dirty page count, free pages, and total pages
mysql -e "SELECT VARIABLE_NAME, VARIABLE_VALUE FROM performance_schema.global_status WHERE VARIABLE_NAME IN ('Innodb_buffer_pool_pages_dirty', 'Innodb_buffer_pool_pages_free', 'Innodb_buffer_pool_pages_total');"
# Check for long-running transactions that may be blocking purge or holding scans
mysql -e "SELECT trx_mysql_thread_id, trx_started, TIMESTAMPDIFF(SECOND, trx_started, NOW()) AS age_seconds, LEFT(trx_query, 80) FROM information_schema.INNODB_TRX ORDER BY trx_started ASC LIMIT 5;"
# InnoDB internals: look for 'Pending writes: LRU' and 'Free buffers'
mysql -e "SHOW ENGINE INNODB STATUS\G" | grep -A2 -E "Pending writes|Free buffers"
# Current queries that may be scanning large tables
mysql -e "SHOW PROCESSLIST;"
# Buffer pool configuration and uptime
mysql -e "SHOW GLOBAL VARIABLES WHERE Variable_name IN ('innodb_buffer_pool_size', 'innodb_buffer_pool_instances', 'innodb_buffer_pool_load_at_startup');"
mysql -e "SHOW GLOBAL STATUS LIKE 'Uptime';"

How to diagnose it

  1. Confirm the rate is sustained. Sample Innodb_buffer_pool_wait_free at two points 60 seconds apart. A single increment during a write burst is noise. A steady rate is a blockage.

  2. Check the buffer pool hit ratio. Compute 1 - (reads / read_requests) using the cumulative values since uptime, or sample twice for an interval measurement. If the ratio is falling below 99% for OLTP, the working set has outgrown the pool.

  3. Inspect dirty page pressure. If Innodb_buffer_pool_pages_dirty is a large fraction of total pages and climbing, the page cleaners cannot drain dirty pages fast enough. Cross-check checkpoint age from SHOW ENGINE INNODB STATUS by subtracting Last checkpoint at from Log sequence number in the LOG section. If checkpoint age is above 75% of total redo log capacity, flushing is the bottleneck.

  4. Look for scan pollution. In SHOW PROCESSLIST, identify queries against large tables. A full table scan from a report or export can evict hot pages and drive wait_free even if the working set normally fits.

  5. Validate I/O capacity. Run iostat -x 1 5 on the data and redo log volumes. If %util is pinned and await is elevated, the storage layer cannot service the flush rate regardless of configuration.

  6. Rule out cold start. If Uptime is under 30 minutes and hit ratio is recovering, the counter may reflect normal warmup. Ensure innodb_buffer_pool_dump_at_shutdown and innodb_buffer_pool_load_at_startup are enabled to reduce this window.

  7. Correlate with connection state. Rising Threads_running alongside wait_free confirms the stall is impacting concurrent queries. If Threads_connected is high but Threads_running is low, the issue is likely elsewhere.

Metrics and signals to monitor

SignalWhy it mattersWarning sign
Innodb_buffer_pool_wait_free rateDirect measure of synchronous flush waitsSustained nonzero rate over 5 minutes
Buffer pool hit ratio (interval)Whether the working set fits in memory< 99% sustained for OLTP
Innodb_buffer_pool_pages_dirty ratioDirty page accumulation vs flush capacityTrending toward innodb_max_dirty_pages_pct
Checkpoint age / redo log capacityProximity to forced synchronous checkpoint flush> 75% of capacity sustained
Innodb_buffer_pool_reads ratePhysical disk reads caused by pool missesExponential increase over baseline
Threads_runningActive query concurrencySpiking while Questions rate drops
Disk read I/O utilizationStorage saturation from missed reads%util approaching 100% with high await

Fixes

Increase the buffer pool size

If the working set has genuinely grown, increase innodb_buffer_pool_size. In MySQL 5.7.5 and later, this is dynamic:

SET GLOBAL innodb_buffer_pool_size = <bytes>;

Warning: Online buffer pool resizing can cause a temporary stall while memory is reallocated. Execute during a low-traffic window and watch Threads_running.

Tradeoff: Resist allocating more than 60-80% of total RAM. The operating system and per-connection buffers still need headroom. Monitor OS swap and OOM behavior after resizing.

Accelerate dirty page flushing

If dirty pages are the constraint and storage has unused IOPS, raise innodb_io_capacity and innodb_io_capacity_max to let the page cleaners flush more aggressively. Verify that innodb_page_cleaners is sized appropriately for your buffer pool instance count and core availability.

Tradeoff: Higher flush rates consume disk bandwidth and can increase read latency under mixed workloads.

Isolate or kill scan queries

If a specific query is polluting the pool, kill the connection:

KILL <thread_id>;

Warning: Killing a connection aborts its active transaction and rolls back uncommitted work. Identify the query in INNODB_TRX or PROCESSLIST first.

For recurring reports or exports, move them to a replica or add an appropriate index to avoid full table scans.

Tradeoff: Adding indexes requires DDL and additional storage overhead.

Upgrade storage or separate I/O paths

If iostat shows the disk subsystem is saturated, move the redo log, binlog, or data files to faster or dedicated storage. Ensure the redo log device supports low-latency synchronous writes.

Tradeoff: Infrastructure changes may require a maintenance window and data directory reconfiguration.

Prevention

  • Monitor Innodb_buffer_pool_wait_free rate continuously. Do not rely solely on hit ratio; it is a lagging indicator for acute pressure.
  • Size the buffer pool to absorb peak working set plus 10-20% headroom. Track data growth rate against pool size for runway estimation.
  • Enable innodb_buffer_pool_dump_at_shutdown and innodb_buffer_pool_load_at_startup to shorten post-restart warmup and avoid false-positive pressure signals.
  • Run large analytical scans against read replicas instead of the primary.
  • Review slow query logs for emerging full-table scans before they begin evicting hot pages.

How Netdata helps

  • Netdata collects Innodb_buffer_pool_wait_free and charts the derived rate, so you do not need to sample manually.
  • Correlate wait_free spikes with buffer pool hit ratio, Innodb_buffer_pool_reads, disk I/O utilization, and Threads_running in one view to confirm the cascade.
  • Alert on sustained nonzero wait_free rates with built-in thresholds.
  • Visualize InnoDB dirty page ratio and checkpoint age alongside OS memory metrics to distinguish buffer pool pressure from external memory pressure.