MySQL slow after restart: buffer pool warm-up and the cold cache

You restart MySQL for maintenance, crash recovery, or a failover. Seconds later, query latency jumps from sub-millisecond to tens or hundreds of milliseconds. Disk read I/O saturates. The buffer pool hit ratio that normally sits above 99% has collapsed. There are no runaway queries, no lock waits, and no checkpoint stalls. The server is simply cold.

After any restart, the InnoDB buffer pool starts empty. Every data page read misses memory and goes to disk, so Innodb_buffer_pool_reads spikes. If the adaptive hash index is enabled, it is empty too. The server may accept client connections long before the pool is ready to serve production traffic. The issue looks like a capacity crisis, but it is usually transient and expected. Distinguishing a cold cache from a true capacity problem keeps you from chasing the wrong fixes during an incident.

What this means

InnoDB stores data and index pages in a fixed-size buffer pool, typically 60-80% of RAM. Under normal operation, frequently accessed pages stay in this LRU cache, making reads fast. After a restart, the pool is empty. All logical reads that would have been satisfied from memory become physical disk reads. This is true even if innodb_buffer_pool_load_at_startup is enabled, because the background loader must still pull pages from disk, and it runs concurrently with incoming traffic.

MySQL 5.7 and later enable innodb_buffer_pool_dump_at_shutdown and innodb_buffer_pool_load_at_startup by default. At shutdown, MySQL writes the hottest page identifiers, tablespace ID and page ID pairs, to a small file named ib_buffer_pool in the data directory. At startup, a background thread reads this list and reloads those pages. However, the default innodb_buffer_pool_dump_pct is 25, meaning only the hottest quarter of the occupied pool is dumped, not the full configured size. If the pool was half full, you dump roughly 12.5% of total capacity. The loader also cannot restore pages that were not present at the last clean shutdown, so a crash restart loads from a potentially stale list.

The adaptive hash index is also cleared on restart. On MySQL versions before 8.4, where AHI is enabled by default, point lookups lose acceleration until the hash structure rebuilds from access patterns. On MySQL 8.4 and later, AHI is disabled by default, so this dimension only matters if you have explicitly re-enabled it.

Critically, MySQL begins accepting connections before the buffer pool load finishes. Traffic that hits the server immediately after restart runs against a cold cache regardless of your warm-up settings.

flowchart TD
    A[MySQL restart] --> B[Server accepts connections]
    B --> C{Dump/load enabled?}
    C -->|Yes| D[Background thread loads page IDs from ib_buffer_pool]
    C -->|No| E[Buffer pool empty]
    D --> F[Queries still issue physical reads]
    E --> F
    F --> G[Working set populates LRU]
    G --> H[Hit ratio recovers]

Common causes

CauseWhat it looks likeFirst thing to check
Normal cold start with default or disabled dump/loadUptime near zero; Innodb_buffer_pool_reads spiking; hit ratio < 50%SHOW GLOBAL VARIABLES LIKE 'innodb_buffer_pool_load%';
Background load still in progressConnections accepted but latency remains high; disk I/O activeSHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_load_status';
Crash recovery using stale dumpError log shows crash recovery; loaded pages may no longer be hotError log for recovery messages; workload pattern age
Replica promoted to primaryReplica Uptime is low after failover; old primary was warm but new one is coldWhether the replica had traffic before promotion
AHI warming (pre-8.4 or explicitly enabled)Point lookups slower than usual despite acceptable hit ratio; CPU not saturatedSHOW ENGINE INNODB STATUS hash search ratio

Quick checks

-- Confirm restart time
SHOW GLOBAL STATUS LIKE 'Uptime';

-- Check if background buffer pool load is active or done
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_load_status';

-- Check physical vs logical reads
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';

-- Verify dump/load configuration
SHOW GLOBAL VARIABLES LIKE 'innodb_buffer_pool_dump%';
SHOW GLOBAL VARIABLES LIKE 'innodb_buffer_pool_load%';

-- Check AHI metrics if enabled (MySQL 8.0+)
SELECT NAME, COUNT FROM INFORMATION_SCHEMA.INNODB_METRICS
WHERE NAME IN ('adaptive_hash_searches', 'adaptive_hash_searches_btree');

-- Check active concurrency to rule out overload
SHOW GLOBAL STATUS LIKE 'Threads_running';

How to diagnose it

  1. Confirm the restart. SHOW GLOBAL STATUS LIKE 'Uptime'; returns a low value. Cross-check with the error log for startup or crash recovery messages.
  2. Calculate the hit ratio. Use Innodb_buffer_pool_read_requests (logical) and Innodb_buffer_pool_reads (physical). If physical reads dominate and uptime is low, the cache is cold.
  3. Check load status. SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_load_status'; tells you whether the background loader finished, is still running, or never started. On a busy server you may see load completed but latency still elevated if the dumped set did not cover your current working set.
  4. Inspect configuration. innodb_buffer_pool_dump_at_shutdown and innodb_buffer_pool_load_at_startup should both be ON. If either is OFF, the warm-up is entirely natural.
  5. Review the dump percentage. SHOW GLOBAL VARIABLES LIKE 'innodb_buffer_pool_dump_pct'; shows how much of the pool is saved. The default of 25 may be insufficient for large or fragmented working sets.
  6. Evaluate crash vs clean shutdown. If the previous shutdown was a crash, the ib_buffer_pool file reflects the last clean shutdown state. Stale or unnecessary pages may be loaded, and recently hot pages are missing.
  7. Check AHI state. If innodb_adaptive_hash_index is ON, look at SHOW ENGINE INNODB STATUS under “INSERT BUFFER AND ADAPTIVE HASH INDEX.” Near-zero hash searches with high non-hash searches means AHI is still cold. This is normal for the first 30 minutes.
  8. Rule out capacity problems. If the hit ratio does not recover toward normal baselines after 30-60 minutes of production traffic, the working set may genuinely exceed the buffer pool size. See MySQL InnoDB buffer pool hit ratio collapse: the cliff edge.

Metrics and signals to monitor

SignalWhy it mattersWarning sign
Innodb_buffer_pool_reads rateDirect measure of physical disk readsSpikes > 10x baseline with low Uptime
Buffer pool hit ratioPercentage of reads served from memoryRemains below 95% more than 60 minutes after restart
Innodb_buffer_pool_load_statusIndicates whether background warm-up is activeShows “not started” when load is expected, or hangs indefinitely
UptimeConfirms a restart eventSudden reset paired with read spikes
AHI hash search ratio (if enabled)Tracks hash index effectivenessNear zero for more than 30 minutes post-restart
Threads_runningDistinguishes warm-up from overloadStays within normal bounds while hit ratio is low

Fixes

Wait for natural warm-up

If you can tolerate the latency, let production traffic populate the LRU. The hottest pages migrate to the young sublist naturally. Tradeoff: latency stays elevated until the working set is fully resident, which on large pools can take 30 minutes or more.

Enable and tune dump/load

Ensure innodb_buffer_pool_dump_at_shutdown = ON and innodb_buffer_pool_load_at_startup = ON. Raise innodb_buffer_pool_dump_pct to save a larger fraction of the pool. Tradeoff: larger dump files and longer startup load times, but a warmer cache sooner.

Trigger an online load

If the server is already running and has a recent ib_buffer_pool file, run:

SET GLOBAL innodb_buffer_pool_load_now = ON;

This starts the background loader immediately. Tradeoff: it adds physical I/O contention while running. You can abort it with:

SET GLOBAL innodb_buffer_pool_load_abort = ON;

Pre-warm replicas before failover

A replica’s ib_buffer_pool reflects its own page list and space IDs, not the source’s. Before promoting a replica, run representative SELECT queries against it to populate its buffer pool with the current working set. Tradeoff: requires scripting and application knowledge, but prevents a cold-cache surprise during cutover.

Handle AHI separately

If you run MySQL pre-8.4 with AHI enabled, accept that point lookups will be slower for the first 30 minutes. If you see AHI latch contention (btr_search waits in SHOW ENGINE INNODB STATUS) during warm-up, consider disabling it dynamically:

SET GLOBAL innodb_adaptive_hash_index = OFF;

Tradeoff: you lose the lookup acceleration, but you avoid mutex contention on a cold structure.

Handle compressed tables

If you use compressed tables, pages load into the buffer pool in compressed form and are decompressed on access. The single background loader thread is inefficient for this. In that case, manual warm-up via representative SELECT queries in separate connections can be more effective than the built-in load. Tradeoff: uses more connection threads during the warm-up window.

Prevention

  • Enable innodb_buffer_pool_dump_at_shutdown and innodb_buffer_pool_load_at_startup on all production instances.
  • Suppress buffer pool hit-ratio alerts for 30-60 minutes after any restart. Use Uptime as a guard condition in alert rules.
  • For planned failovers, script a pre-warming phase against the target replica before promotion.
  • Periodically refresh the dump file with SET GLOBAL innodb_buffer_pool_dump_now = ON; so crash recovery starts from a recent page list.
  • Include Innodb_buffer_pool_load_status in readiness checks if your orchestration routes traffic immediately after port bind.
  • Size the buffer pool large enough that even a partial dump covers the critical working set.

How Netdata helps

  • Charts Innodb_buffer_pool_reads, hit ratio, and Uptime together so the restart-to-spike correlation is immediate.
  • Can suppress buffer pool alerts while Uptime is below a configurable threshold.
  • Exposes AHI efficiency metrics where available to separate buffer pool coldness from hash index coldness.
  • Annotates MySQL restarts on charts automatically.
  • Tracks Threads_running to confirm elevated latency is from read misses, not connection pile-up or lock storms.