MySQL adaptive hash index latch contention: high CPU, low throughput
Burning CPU while barely answering queries: OS user time near 100%, application latency spiking, and Questions flat or falling. Threads_connected is high but Threads_running stays low, and disk I/O is quiet. This pattern matches an Adaptive Hash Index (AHI) latch storm.
InnoDB builds an in-memory hash index over frequently accessed B-tree pages to speed lookups. Since MySQL 5.7, the latch is partitioned into innodb_adaptive_hash_index_parts (default 8). Under high concurrency, especially point lookups on a hot index, threads can collide on the same partition latch and spin instead of executing. CPU saturates while throughput collapses.
The fastest fix is SET GLOBAL innodb_adaptive_hash_index = OFF. It takes effect immediately, drops the entire AHI structure, and stops the spin waits.
What this means
AHI lives in the buffer pool. When a thread looks up a popular index, InnoDB may hash the target B-tree page to skip tree traversal. The btr_search rw-latch protects each partition. If one thread holds the latch exclusively to build or evict a hash entry, concurrent threads spin. On a many-core server with a point-lookup workload, eight partitions can serialize access and create a convoy.
Threads are counted as connected but not running, so Threads_running stays below Threads_connected while CPU climbs. Because the bottleneck is a memory latch, not disk, Innodb_buffer_pool_reads does not spike and I/O utilization remains low. Queries may not appear in the slow query log because they are not waiting on locks or disk; they are stuck in CPU spin loops inside InnoDB.
Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| AHI partition collision on a hot index | btr_search waits in SHOW ENGINE INNODB STATUS with spin rounds above 100K; a specific index may be named | SEMAPHORES section for btr_search latch lines |
innodb_adaptive_hash_index_parts too small for core count | High CPU on a many-core host running point lookups with the default partition count | SHOW GLOBAL VARIABLES LIKE 'innodb_adaptive_hash_index_parts' against CPU core count |
| Buffer pool churn invalidating AHI entries | AHI hit ratio flat or dropping alongside high buffer pool reads | INFORMATION_SCHEMA.INNODB_METRICS for adaptive_hash_searches versus adaptive_hash_searches_btree |
| Workload shifted to scans or range lookups | Hash searches below 20% of total lookups; CPU high but latch waits absent | AHI hit ratio from INNODB_METRICS and recent query plan changes |
Quick checks
-- Measure the active thread gap
SHOW GLOBAL STATUS LIKE 'Threads_running';
SHOW GLOBAL STATUS LIKE 'Threads_connected';
-- Sample query throughput twice to compute rate
SHOW GLOBAL STATUS LIKE 'Questions';
-- Look for btr_search latch waits and spin rounds
SHOW ENGINE INNODB STATUS\G
-- Focus on: SEMAPHORES section
-- Compute AHI efficiency
SELECT NAME, COUNT FROM INFORMATION_SCHEMA.INNODB_METRICS
WHERE NAME IN ('adaptive_hash_searches', 'adaptive_hash_searches_btree');
-- Check partition count against CPU cores
SHOW GLOBAL VARIABLES LIKE 'innodb_adaptive_hash_index_parts';
-- Rule out I/O saturation
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read_requests';
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_reads';
-- Rule out row lock contention (MySQL 8.0)
SELECT COUNT(*) FROM performance_schema.data_lock_waits;
-- MySQL 5.7
SELECT COUNT(*) FROM information_schema.INNODB_LOCK_WAITS;
How to diagnose it
Use this flow to confirm an AHI latch storm before making changes.
flowchart TD
A[High CPU and low throughput] --> B{Threads_running < Threads_connected?}
B -->|Yes| C[Check SEMAPHORES btr_search]
B -->|No| D[Check row locks and I/O]
C --> E{Spin rounds > 100K?}
E -->|Yes| F[AHI latch storm]
E -->|No| G[Other mutex contention]
F --> H[Disable AHI dynamically]- Correlate CPU and throughput. Confirm that OS CPU is above 90% while the MySQL
Questionsrate has dropped sharply from baseline. - Confirm the thread gap. If
Threads_runningis far belowThreads_connected, threads are stuck on internal waits rather than executing. This gap is not unique to AHI, but it rules out simple CPU saturation from productive work. - Inspect
SHOW ENGINE INNODB STATUSand find the SEMAPHORES section. Look for lines containingbtr_search. Spin rounds above 100K are a heuristic indicating threads are burning CPU on this latch. - Check AHI utility. Query
INFORMATION_SCHEMA.INNODB_METRICSforadaptive_hash_searchesandadaptive_hash_searches_btree. If hash searches are less than 20% of total, AHI is not earning its keep. - Compare
innodb_adaptive_hash_index_partsto CPU core count. If the partition count is far below core count, the latch is a serialization bottleneck. - Rule out I/O and row locks. Verify that
Innodb_buffer_pool_readsis not spiking and thatperformance_schema.data_lock_waits(MySQL 8.0) orinformation_schema.INNODB_LOCK_WAITS(MySQL 5.7) is empty. An AHI storm is CPU-bound with quiet disk and no row lock waits.
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
CPU utilization vs Questions rate | Distinguishes spinning from productive load | CPU above 90% with flat or falling Questions |
btr_search semaphore waits | Direct evidence of AHI partition contention | Spin rounds above 100K in SHOW ENGINE INNODB STATUS |
Threads_running / Threads_connected gap | Reveals threads stuck waiting instead of executing | Threads_running sustained below Threads_connected during load |
| AHI hit ratio | Shows whether AHI is worth its memory and risk | adaptive_hash_searches / total lookups below 20% |
Innodb_buffer_pool_read_requests rate | Context for total lookup volume; AHI only helps point lookups | High volume with low AHI utilization suggests mispartitioning or scans |
Fixes
Immediate relief: disable AHI dynamically
Warning: this is disruptive. The AHI structure is dropped immediately and queries revert to B-tree traversal.
Run:
SET GLOBAL innodb_adaptive_hash_index = OFF;
Requires SUPER or SYSTEM_VARIABLES_ADMIN. Threads stop spinning immediately. Expect a brief throughput dip, followed by recovery as convoyed queries resume. If the workload genuinely relied on AHI, you may see a small post-recovery latency increase on hot point lookups. During a storm, this is still usually a net win because the system moves from near-zero throughput to stable execution.
Permanent fix: increase partition count
Set innodb_adaptive_hash_index_parts to roughly twice the CPU core count in the configuration file, then restart the server. Valid range is 1 to 512. More partitions spread latch pressure across separate latches, reducing the chance that two hot threads land on the same partition. Tradeoff: additional partitions consume slightly more memory and can reduce per-partition cache locality. This does not eliminate contention, but it raises the concurrency level at which the storm appears.
Operational choice: leave AHI disabled
If your workload is scan-heavy, write-heavy, or already well served by the buffer pool, disabling AHI permanently removes this failure mode. Some production runbooks disable it by default. Tradeoff: you lose the hash shortcut for hot page lookups. On modern hardware with fast NVMe and large buffer pools, the extra B-tree traversal cost is often imperceptible compared to the risk of a production latch storm.
Fix query patterns
If the workload shifted to range scans or unanchored LIKE queries, AHI cannot help because it only accelerates exact-match page lookups. Optimize queries to use precise point lookups, add covering indexes, or route analytical scans to a replica. If the scan pattern is permanent, disable AHI rather than letting it consume buffer pool memory for no benefit.
Prevention
- Proactively monitor
SHOW ENGINE INNODB STATUSSEMAPHORES forbtr_searchwaits after any major workload change or deployment. - Size
innodb_adaptive_hash_index_partsto twice CPU cores at provisioning time, not after the first incident. - Baseline AHI hit ratio from
INFORMATION_SCHEMA.INNODB_METRICS; if it stays below 50% for a full business cycle, disable AHI permanently in the configuration. - Do not expect AHI to help on workloads dominated by range scans, full table scans, or heavy writes. It is designed for repeated point lookups on a stable working set.
How Netdata helps
- Netdata correlates CPU utilization with MySQL
QuestionsandThreads_runningin the same view, making the spinning-threads gap immediately visible. - Buffer pool and disk I/O charts help rule out storage saturation without switching tools.
- Alerts on
Threads_runningdiverging fromThreads_connectedcan catch latch storms before throughput collapses completely. - Long-term baseline tracking of query throughput highlights subtle slowdowns that precede full CPU saturation.
Related guides
- How MySQL actually works in production: a mental model for operators
- MySQL Aborted_connects and Aborted_clients climbing: diagnosis
- MySQL InnoDB buffer pool hit ratio collapse: the cliff edge
- MySQL innodb_buffer_pool_size tuning: 60-80% of RAM and when that breaks
- MySQL Innodb_buffer_pool_wait_free > 0: buffer pool memory pressure
- MySQL InnoDB checkpoint age: the redo log capacity signal nobody watches
- MySQL connection exhaustion: detection, diagnosis, and prevention
- MySQL innodb_deadlock_detect=OFF: when deadlock detection becomes the bottleneck
- MySQL ERROR 1213: Deadlock found when trying to get lock; try restarting transaction
- MySQL FLUSH TABLES WITH READ LOCK stall: backups that freeze the server
- MySQL slow commits with idle CPU: redo and binlog fsync pressure
- MySQL gap locks and next-key locks: surprising deadlocks under REPEATABLE READ







