MySQL Innodb_buffer_pool_wait_free > 0: buffer pool memory pressure
Innodb_buffer_pool_wait_free increments when InnoDB must synchronously flush dirty pages to make room for new reads. A sustained nonzero rate means queries are waiting on disk writes before they can proceed. Unlike the buffer pool hit ratio, which can stay above 99% while the system stalls, wait_free confirms the buffer pool is operating at its limit.
This is the Buffer Pool Cliff pattern: once the working set exceeds available clean pages, performance degrades non-linearly, disk I/O saturates, and threads pile up.
What this means
Innodb_buffer_pool_wait_free counts how often InnoDB needed a free page, found none, and waited while dirty pages were flushed. Under normal operation, background page cleaning keeps queries unblocked. When the counter rises, either the background flush cannot keep up, or the pool is too small for the working set.
The counter is cumulative. Measure the rate of change over a 60-second window. Brief spikes during bursty writes are tolerable; a sustained nonzero rate is not normal and adds latency to affected queries.
The typical cascade: dirty pages accumulate, free pages drop to zero, reads trigger synchronous flushes, query latency increases, threads pile up, and disk I/O saturates.
flowchart TD A[Dirty pages accumulate] --> B[Free pages exhausted] B --> C[Innodb_buffer_pool_wait_free increments] C --> D[Synchronous flush on read] D --> E[Query latency spikes] E --> F[Threads_running piles up] F --> G[Disk I/O saturates] G --> H[Throughput collapse]
Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| Buffer pool too small for working set | Hit ratio degrading over days; wait_free increments during peak traffic | Innodb_buffer_pool_reads rate relative to Innodb_buffer_pool_read_requests |
| Write rate exceeds flush capacity | Dirty page ratio climbing; checkpoint age trending toward redo log capacity | Innodb_buffer_pool_pages_dirty ratio and checkpoint age in SHOW ENGINE INNODB STATUS |
| Slow I/O subsystem | Innodb_data_pending_writes or pending fsyncs elevated; OS disk %util high | iostat -x on the data and redo log volumes |
| Large table scans evicting hot pages | Sudden hit ratio drop after a deploy or batch job; specific queries on large tables | SHOW PROCESSLIST for active queries against large tables |
| Cold buffer pool after restart | Hit ratio low immediately after restart; improves over 30-60 minutes | Uptime and whether innodb_buffer_pool_load_at_startup is enabled |
Quick checks
Run these read-only checks to confirm the scope of the pressure.
# Check current wait_free and compare again in 60 seconds to get the rate
mysql -e "SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_wait_free';"
# Buffer pool hit ratio over the current uptime
mysql -e "SELECT (1 - (a.VARIABLE_VALUE / b.VARIABLE_VALUE)) * 100 AS hit_pct FROM performance_schema.global_status a JOIN performance_schema.global_status b WHERE a.VARIABLE_NAME = 'Innodb_buffer_pool_reads' AND b.VARIABLE_NAME = 'Innodb_buffer_pool_read_requests';"
# Dirty page count, free pages, and total pages
mysql -e "SELECT VARIABLE_NAME, VARIABLE_VALUE FROM performance_schema.global_status WHERE VARIABLE_NAME IN ('Innodb_buffer_pool_pages_dirty', 'Innodb_buffer_pool_pages_free', 'Innodb_buffer_pool_pages_total');"
# Check for long-running transactions that may be blocking purge or holding scans
mysql -e "SELECT trx_mysql_thread_id, trx_started, TIMESTAMPDIFF(SECOND, trx_started, NOW()) AS age_seconds, LEFT(trx_query, 80) FROM information_schema.INNODB_TRX ORDER BY trx_started ASC LIMIT 5;"
# InnoDB internals: look for 'Pending writes: LRU' and 'Free buffers'
mysql -e "SHOW ENGINE INNODB STATUS\G" | grep -A2 -E "Pending writes|Free buffers"
# Current queries that may be scanning large tables
mysql -e "SHOW PROCESSLIST;"
# Buffer pool configuration and uptime
mysql -e "SHOW GLOBAL VARIABLES WHERE Variable_name IN ('innodb_buffer_pool_size', 'innodb_buffer_pool_instances', 'innodb_buffer_pool_load_at_startup');"
mysql -e "SHOW GLOBAL STATUS LIKE 'Uptime';"
How to diagnose it
Confirm the rate is sustained. Sample
Innodb_buffer_pool_wait_freeat two points 60 seconds apart. A single increment during a write burst is noise. A steady rate is a blockage.Check the buffer pool hit ratio. Compute
1 - (reads / read_requests)using the cumulative values since uptime, or sample twice for an interval measurement. If the ratio is falling below 99% for OLTP, the working set has outgrown the pool.Inspect dirty page pressure. If
Innodb_buffer_pool_pages_dirtyis a large fraction of total pages and climbing, the page cleaners cannot drain dirty pages fast enough. Cross-check checkpoint age fromSHOW ENGINE INNODB STATUSby subtractingLast checkpoint atfromLog sequence numberin the LOG section. If checkpoint age is above 75% of total redo log capacity, flushing is the bottleneck.Look for scan pollution. In
SHOW PROCESSLIST, identify queries against large tables. A full table scan from a report or export can evict hot pages and drivewait_freeeven if the working set normally fits.Validate I/O capacity. Run
iostat -x 1 5on the data and redo log volumes. If%utilis pinned andawaitis elevated, the storage layer cannot service the flush rate regardless of configuration.Rule out cold start. If
Uptimeis under 30 minutes and hit ratio is recovering, the counter may reflect normal warmup. Ensureinnodb_buffer_pool_dump_at_shutdownandinnodb_buffer_pool_load_at_startupare enabled to reduce this window.Correlate with connection state. Rising
Threads_runningalongsidewait_freeconfirms the stall is impacting concurrent queries. IfThreads_connectedis high butThreads_runningis low, the issue is likely elsewhere.
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
Innodb_buffer_pool_wait_free rate | Direct measure of synchronous flush waits | Sustained nonzero rate over 5 minutes |
| Buffer pool hit ratio (interval) | Whether the working set fits in memory | < 99% sustained for OLTP |
Innodb_buffer_pool_pages_dirty ratio | Dirty page accumulation vs flush capacity | Trending toward innodb_max_dirty_pages_pct |
| Checkpoint age / redo log capacity | Proximity to forced synchronous checkpoint flush | > 75% of capacity sustained |
Innodb_buffer_pool_reads rate | Physical disk reads caused by pool misses | Exponential increase over baseline |
Threads_running | Active query concurrency | Spiking while Questions rate drops |
| Disk read I/O utilization | Storage saturation from missed reads | %util approaching 100% with high await |
Fixes
Increase the buffer pool size
If the working set has genuinely grown, increase innodb_buffer_pool_size. In MySQL 5.7.5 and later, this is dynamic:
SET GLOBAL innodb_buffer_pool_size = <bytes>;
Warning: Online buffer pool resizing can cause a temporary stall while memory is reallocated. Execute during a low-traffic window and watch Threads_running.
Tradeoff: Resist allocating more than 60-80% of total RAM. The operating system and per-connection buffers still need headroom. Monitor OS swap and OOM behavior after resizing.
Accelerate dirty page flushing
If dirty pages are the constraint and storage has unused IOPS, raise innodb_io_capacity and innodb_io_capacity_max to let the page cleaners flush more aggressively. Verify that innodb_page_cleaners is sized appropriately for your buffer pool instance count and core availability.
Tradeoff: Higher flush rates consume disk bandwidth and can increase read latency under mixed workloads.
Isolate or kill scan queries
If a specific query is polluting the pool, kill the connection:
KILL <thread_id>;
Warning: Killing a connection aborts its active transaction and rolls back uncommitted work. Identify the query in INNODB_TRX or PROCESSLIST first.
For recurring reports or exports, move them to a replica or add an appropriate index to avoid full table scans.
Tradeoff: Adding indexes requires DDL and additional storage overhead.
Upgrade storage or separate I/O paths
If iostat shows the disk subsystem is saturated, move the redo log, binlog, or data files to faster or dedicated storage. Ensure the redo log device supports low-latency synchronous writes.
Tradeoff: Infrastructure changes may require a maintenance window and data directory reconfiguration.
Prevention
- Monitor
Innodb_buffer_pool_wait_freerate continuously. Do not rely solely on hit ratio; it is a lagging indicator for acute pressure. - Size the buffer pool to absorb peak working set plus 10-20% headroom. Track data growth rate against pool size for runway estimation.
- Enable
innodb_buffer_pool_dump_at_shutdownandinnodb_buffer_pool_load_at_startupto shorten post-restart warmup and avoid false-positive pressure signals. - Run large analytical scans against read replicas instead of the primary.
- Review slow query logs for emerging full-table scans before they begin evicting hot pages.
How Netdata helps
- Netdata collects
Innodb_buffer_pool_wait_freeand charts the derived rate, so you do not need to sample manually. - Correlate
wait_freespikes with buffer pool hit ratio,Innodb_buffer_pool_reads, disk I/O utilization, andThreads_runningin one view to confirm the cascade. - Alert on sustained nonzero
wait_freerates with built-in thresholds. - Visualize InnoDB dirty page ratio and checkpoint age alongside OS memory metrics to distinguish buffer pool pressure from external memory pressure.
Related guides
- How MySQL actually works in production: a mental model for operators
- MySQL Aborted_connects and Aborted_clients climbing: diagnosis
- MySQL InnoDB checkpoint age: the redo log capacity signal nobody watches
- MySQL connection exhaustion: detection, diagnosis, and prevention
- MySQL innodb_deadlock_detect=OFF: when deadlock detection becomes the bottleneck
- MySQL ERROR 1213: Deadlock found when trying to get lock; try restarting transaction
- MySQL FLUSH TABLES WITH READ LOCK stall: backups that freeze the server
- MySQL gap locks and next-key locks: surprising deadlocks under REPEATABLE READ
- MySQL ERROR 1205: Lock wait timeout exceeded; try restarting transaction
- MySQL Innodb_log_waits > 0: the log buffer is too small (not a checkpoint stall)
- MySQL long-running transactions: detecting and killing the silent blocker
- MySQL: got a packet bigger than ‘max_allowed_packet’ bytes - causes and fixes







