MySQL innodb_deadlock_detect=OFF: when deadlock detection becomes the bottleneck
At extreme concurrency, InnoDB’s deadlock detector can become the bottleneck. It traverses the wait-for graph on every lock enqueue. That cost is usually negligible until thousands of concurrent writing transactions hit a small set of hot rows. When traversal itself saturates CPU, throughput collapses even though disks are idle and the buffer pool is warm.
innodb_deadlock_detect=OFF removes that traversal. Instead of instant deadlock detection, InnoDB relies exclusively on innodb_lock_wait_timeout to break cycles. The blocked transaction waits, holding its locks, until the timeout expires. This is not a generic performance tuning knob. It is a narrow, high-risk tradeoff that converts instant rollback into delayed timeout, and it is appropriate only when you have confirmed through testing that deadlock detection is the bottleneck.
What it is and why it matters
innodb_deadlock_detect is a dynamic global boolean variable. It defaults to ON in MySQL 5.7.15+ and MySQL 8.0+. It does not exist in MySQL 5.6 or earlier; attempts to set it there return an unknown variable error.
When ON, every time a transaction requests a row or table lock that cannot be granted immediately, InnoDB runs DeadlockChecker::check_and_resolve. The checker walks the wait-for graph to see whether adding this wait edge creates a cycle. If it finds one, it instantly rolls back the smaller transaction (by row count) and returns ER_LOCK_DEADLOCK to the victim.
When OFF, InnoDB skips that traversal. The transaction is enqueued as a waiter and sleeps until the lock is released or innodb_lock_wait_timeout (default 50 seconds) expires. The victim selection logic is bypassed entirely.
Under normal OLTP concurrency, the CPU cost of detection is negligible. Under extreme contention (thousands of threads, hot-row updates, queue-style access patterns), the cost of the graph walk on every lock enqueue becomes the dominant consumer of CPU time. Historical Bug #49047 demonstrated pathological O(n^2) behavior in this path when many locks existed on a single row. While subsequent fixes improved throughput, the fundamental scaling concern remains: detection cost grows with the depth of the wait-for graph.
How it works
When a lock request is blocked, InnoDB must decide whether the new wait edge creates a deadlock.
With detection enabled, the following occurs:
- InnoDB enters
lock_rec_enqueue_waiting()for row locks orlock_table_enqueue_waiting()for table locks. - It invokes the deadlock checker, which traverses the wait-for list of transactions.
- If the wait-for list exceeds 200 transactions, or the checker must inspect more than 1,000,000 locks owned by transactions on that list, InnoDB treats the condition as a deadlock regardless of the setting. The checking transaction is rolled back.
- If a cycle is found, the cheaper transaction (by estimated row modifications) is selected as the victim and rolled back immediately.
- The surviving transactions continue without waiting.
With detection disabled, steps 2 through 4 are skipped. The blocked transaction is placed on the wait queue and relies entirely on innodb_lock_wait_timeout. When the timeout fires, the waiting statement is rolled back. The transaction continues unless innodb_rollback_on_timeout is enabled.
flowchart TD
A[Lock request blocked] --> B{innodb_deadlock_detect}
B -->|ON| C[Traverse wait-for graph]
C --> D{Deadlock found?}
D -->|Yes| E[Roll back victim immediately]
D -->|No| F[Wait for lock or timeout]
B -->|OFF| F
F --> G{Wait exceeds innodb_lock_wait_timeout?}
G -->|Yes| H[Roll back waiter]
G -->|No| I[Lock granted]Note that innodb_print_all_deadlocks logs deadlocks to the error log independently of the innodb_deadlock_detect setting. If you are evaluating this change, enable it. SHOW ENGINE INNODB STATUS only shows the most recent deadlock, which is insufficient when deadlocks are frequent.
Where it shows up in production
This bottleneck appears suddenly when concurrency crosses a threshold where the graph traversal dominates useful work.
Typical triggering conditions:
- Hot-row contention: Counter increments, inventory decrements, or queue pops that all target the same small set of rows.
- High thread counts: Thousands of concurrent transactions, often from connection pool misconfiguration or thread pool absence.
- CPU saturation with low throughput:
Threads_runningis high, CPU is pegged, butQuestionsrate andInnodb_rows_updatedare flat or falling. Disk I/O and buffer pool hit ratio look healthy.
If you see these symptoms, confirm the diagnosis before changing the setting. Use SHOW ENGINE INNODB STATUS and inspect the TRANSACTIONS section for current locks and waits. Correlate with Innodb_row_lock_waits and Innodb_row_lock_current_waits. If the bottleneck is instead missing indexes causing gap locks, or long transactions holding locks, disabling detection will not fix the root cause. It will only remove the CPU cost of noticing the contention.
Tradeoffs and when to use it
Disabling deadlock detection is appropriate only after controlled load testing on representative traffic proves it is the bottleneck. It is not a substitute for fixing transaction design, shortening transactions, or adding indexes that reduce lock ranges.
When to consider it:
- Load testing shows CPU saturation from deadlock detection itself, and query latency drops meaningfully when detection is disabled.
- The workload has very high concurrency on unavoidable hot rows.
- The application already handles lock wait timeouts gracefully, with retry logic at the statement or transaction level.
- You reduce
innodb_lock_wait_timeoutaggressively; consider setting it to a low value such as 2 seconds so that genuine deadlocks do not hold connections for the default 50 seconds.
When to avoid it:
- The workload has a non-trivial natural deadlock rate. With detection off, deadlocked transactions hold locks for the full timeout instead of resolving instantly. This inflates
Innodb_row_lock_current_waitsand can cascade into connection exhaustion. - The application does not distinguish
ER_LOCK_WAIT_TIMEOUTfromER_LOCK_DEADLOCK. A timeout-based rollback under genuine deadlock conditions behaves differently from instant victim selection. - You have not load-tested for at least two weeks with representative deadlock patterns. One documented production failure involved cascading lock buildup behind blocked transactions that eventually exhausted
max_connectionsand froze the instance.
| Aspect | Detection ON | Detection OFF |
|---|---|---|
| CPU cost per blocked lock | Graph traversal | None |
| Deadlock resolution | Instant victim rollback | Delayed until timeout |
| Connection slot held during deadlock | Brief | Full innodb_lock_wait_timeout |
| Victim selection | Smaller transaction by row count | Waiter on timeout |
Risk of max_connections exhaustion | Low | Elevated if deadlocks are frequent |
A special case is setting innodb_lock_wait_timeout to 0 with detection off. This prevents any waiting, which eliminates deadlocks entirely, but it produces a storm of timeout errors. It is viable only if the application handles every statement failure with an immediate retry and conflicts are genuinely rare.
Signals to watch in production
If you disable deadlock detection, monitoring shifts from watching deadlock frequency to watching lock wait duration and connection saturation.
| Signal | Why it matters | Warning sign |
|---|---|---|
lock_deadlocks rate from INFORMATION_SCHEMA.INNODB_METRICS | Deadlocks still occur even with detection off. A sustained rate means the application is creating cycles faster than timeouts can clear them. | Nonzero rate sustained above your tested baseline. |
Innodb_row_lock_current_waits | Shows transactions waiting right now. With detection off, deadlocked transactions inflate this gauge for the full timeout. | Elevated for longer than before the change, or climbing steadily. |
Threads_connected / max_connections | Blocked transactions consume slots until timeout. Under genuine deadlock, multiple participants sit blocked. | Ratio approaching 0.80 during peak, or sudden spikes correlating with lock waits. |
Questions rate | Validates whether the change actually improved throughput. If detection was the bottleneck, this should rise. | Drop from baseline while Threads_connected remains stable indicates queries are stuck waiting. |
Threads_running | If CPU drops but this stays high, the system is still contention-bound, just not on the detection path. | Sustained above CPU core count without corresponding throughput. |
Error log deadlock output with innodb_print_all_deadlocks=ON | Reveals hidden deadlock storms that cumulative counters smooth over. | Frequent deadlock entries, especially with matching timeout errors in application logs. |
Enable innodb_print_all_deadlocks during any evaluation period. Without it, you are flying blind on the actual deadlock rate.
How Netdata helps
Innodb_row_lock_waitsandInnodb_row_lock_current_waitsfromSHOW GLOBAL STATUShelp correlate throughput drops with lock contention before you change detection settings.- The
lock_deadlockscounter fromINFORMATION_SCHEMA.INNODB_METRICSgives a baseline deadlock frequency so you can measure whether disabling detection changes the effective rate. Threads_connectedandThreads_runningshow whether blocked transactions are piling up and threatening connection exhaustion after the change.- Query latency and
Questionsrate validate the experiment: if latency falls and throughput rises after disabling detection, the deadlock traversal was likely the CPU bottleneck. - MySQL error log monitoring surfaces
innodb_print_all_deadlocksentries, exposing deadlock storms that aggregate status variables miss.
Related guides
- How MySQL actually works in production: a mental model for operators: /guides/mysql/how-mysql-works-in-production/
- MySQL Aborted_connects and Aborted_clients climbing: diagnosis: /guides/mysql/mysql-aborted-connections/
- MySQL connection exhaustion: detection, diagnosis, and prevention: /guides/mysql/mysql-connection-exhaustion/
- MySQL ERROR 1213: Deadlock found when trying to get lock; try restarting transaction: /guides/mysql/mysql-deadlock-found/
- MySQL gap locks and next-key locks: surprising deadlocks under REPEATABLE READ: /guides/mysql/mysql-gap-locks-next-key-locks/
- MySQL ERROR 1205: Lock wait timeout exceeded; try restarting transaction: /guides/mysql/mysql-lock-wait-timeout-exceeded/
- MySQL: got a packet bigger than ‘max_allowed_packet’ bytes - causes and fixes: /guides/mysql/mysql-max-allowed-packet-errors/
- MySQL monitoring checklist: the signals every production instance needs: /guides/mysql/mysql-monitoring-checklist/
- MySQL monitoring maturity model: from survival to expert: /guides/mysql/mysql-monitoring-maturity-model/
- MySQL InnoDB row lock contention: finding who blocks whom: /guides/mysql/mysql-row-lock-contention/
- MySQL Threads_created climbing: thread cache churn and missing pooling: /guides/mysql/mysql-thread-cache-churn/
- MySQL Threads_connected vs Threads_running: which one to actually alert on: /guides/mysql/mysql-threads-connected-vs-threads-running/







