MySQL FLUSH TABLES WITH READ LOCK stall: backups that freeze the server
Your application suddenly cannot write. Threads_connected climbs toward max_connections, but Questions flatlines. The last change was a backup job that started ten minutes ago. The culprit is almost always FLUSH TABLES WITH READ LOCK (FTWRL), and the damage is caused not by the lock itself but by what happens while the server waits to acquire it.
FTWRL is triggered by mysqldump --master-data, mysqldump --lock-tables, and similar tools that need a consistent logical backup. It attempts to close all open tables and acquire a global read lock. While it waits for a long-running query to finish, new writes are already blocked. The result is a whole-server freeze: the query rate drops to zero, connections pile up, and the application sees cascading timeouts.
flowchart TD
A[Long-running SELECT or open transaction] --> B[FTWRL waits to close tables]
B --> C[Global read lock blocks writes]
C --> D[Queries queue behind lock]
D --> E[Threads_running and Threads_connected rise]
E --> F[Questions rate drops toward zero]
F --> G[Application timeouts and connection pool exhaustion]What this means
When FLUSH TABLES WITH READ LOCK runs, MySQL must close all open table handles. If a long-running SELECT, an uncommitted transaction, or any statement holding a table open is active, FTWRL waits. During that wait, writes are blocked. Once FTWRL succeeds, it holds a global read lock that blocks all writes until UNLOCK TABLES.
The freeze begins before the backup starts. The backup tool may sit waiting for FTWRL to return while the server queues write operations. Because the global read lock is incompatible with any write, every INSERT, UPDATE, DELETE, and DDL operation stalls. Reads that do not require table modifications may still work, but any write-dependent application function stops.
This is different from a metadata lock cascade on a single table. FTWRL affects the entire instance.
Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
mysqldump --lock-tables or --master-data | Backup cron starts; seconds later, write latency spikes and Questions drops | SHOW FULL PROCESSLIST for the backup connection and the oldest running query |
Long-running reporting SELECT | FTWRL hangs behind a query that has been running for minutes or hours | SHOW FULL PROCESSLIST ordered by Time descending |
| Idle open transaction holding table context | No active query visible, but a transaction started long ago holds locks | SELECT * FROM information_schema.INNODB_TRX ORDER BY trx_started ASC |
| Backup tooling with implicit FTWRL | Third-party backup tools or wrappers that default to table-locking mode | Backup script arguments and environment variables |
Quick checks
These checks are read-only and safe during an incident.
-- Check for the FTWRL session and its runtime
SHOW FULL PROCESSLIST;
-- Find the oldest open transaction that may be blocking table closure
SELECT trx_id, trx_mysql_thread_id, trx_started, trx_query
FROM information_schema.INNODB_TRX
ORDER BY trx_started ASC LIMIT 5;
-- Check if query throughput has collapsed
SHOW GLOBAL STATUS LIKE 'Questions';
SHOW GLOBAL STATUS LIKE 'Threads_running';
SHOW GLOBAL STATUS LIKE 'Threads_connected';
-- Check InnoDB transaction state for long-running blockers
SHOW ENGINE INNODB STATUS\G
Look for a connection running FLUSH TABLES WITH READ LOCK that has been active for longer than a few seconds. Then look for the oldest query or transaction. If the connection is from mysqldump, the User column or the query text in Info usually reveals it. Queries blocked by FTWRL often show Waiting for global read lock or Waiting for table flush in the State column.
How to diagnose it
- Confirm the freeze. Run
SHOW GLOBAL STATUS LIKE 'Questions';twice, five seconds apart. If the counter is flat or barely moving while the application is active, queries are stalling. - Identify FTWRL. Run
SHOW FULL PROCESSLIST. Look for a thread executingFLUSH TABLES WITH READ LOCKwith a highTimevalue. - Find the blocker. Look for the oldest running query or the oldest open transaction in
information_schema.INNODB_TRX. The blocker may be a largeSELECT, an idle transaction, or a slow report. - Verify the source. Check whether the FTWRL connection belongs to a backup user, a cron job, or a deployment script. Cross-reference with your backup schedule.
- Assess impact. Check
Threads_runningandThreads_connected. IfThreads_connectedis climbing towardmax_connections, you have minutes before the server rejects new connections. - Decide. If the blocker is expendable, kill it. If the backup is expendable, kill the FTWRL connection. If neither is safe, you are choosing between a failed backup and a write outage.
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
Questions rate | Direct measure of query throughput | Sustained drop >50% from baseline with active clients |
Threads_running | Active work piling up | Rising while Questions drops indicates queuing |
Threads_connected | Connection pool exhaustion | Approaching max_connections during an FTWRL stall |
INNODB_TRX open transaction age | Identifies the silent blocker | Any transaction older than your backup window |
Metadata lock waits via performance_schema.metadata_locks | Distinguishes metadata locking from row locking | Pending locks on multiple objects during backup |
Innodb_row_lock_current_waits | Rules out row-level contention | Near zero during an FTWRL stall confirms global lock |
Fixes
Warning: KILL is disruptive. The targeted query, transaction, or connection fails immediately and must be retried or reconnected.
Kill the blocking query or transaction
If the blocker is a non-essential report or an idle session, killing it allows FTWRL to acquire its lock immediately. The backup proceeds and the global freeze ends.
-- Identify the thread ID of the blocker from SHOW FULL PROCESSLIST
KILL <blocker_thread_id>;
Tradeoff: the killed query fails and must be retried. If the blocker is a replication thread or a critical ETL job, do not kill it.
Kill the FTWRL session
If the backup is not time-critical, kill the FTWRL session. This releases the global read lock and restores writes fastest.
-- Identify the thread ID running FLUSH TABLES WITH READ LOCK
KILL <ftwrl_thread_id>;
Tradeoff: the backup is inconsistent and must be restarted.
Switch to –single-transaction
For InnoDB-only servers, replace --lock-tables or --master-data with --single-transaction where appropriate. This starts a consistent snapshot without a global read lock.
# Instead of:
mysqldump --lock-tables db_name > backup.sql
# Use:
mysqldump --single-transaction db_name > backup.sql
Tradeoff: --single-transaction starts a transaction in REPEATABLE READ and holds an MVCC read view for the duration of the dump. This can prevent purge of old undo log records and cause history list length growth or undo tablespace bloat. It does not prevent concurrent DDL, and any non-InnoDB tables in the dump may still be inconsistent.
Use physical or storage-level backups
For large datasets, avoid logical dumps entirely. Use Percona XtraBackup, MySQL Enterprise Backup, or storage snapshots. These tools copy data files directly and do not require FTWRL for the entire backup duration.
Tradeoff: requires additional tooling, disk space, and recovery procedures. Percona XtraBackup and MySQL Enterprise Backup still briefly execute FTWRL at the end of the backup to snapshot non-InnoDB tables or capture binlog coordinates, but the duration is typically seconds rather than the full backup window. Snapshot consistency depends on filesystem or storage engine support.
Prevention
- Audit backup commands. Search all cron jobs, Ansible playbooks, and Puppet manifests for
mysqldump --lock-tablesor--master-data. Replace with--single-transactionwhere appropriate. - Run heavy reads on replicas. Direct long-running reports and analytics queries to a replica so they cannot block FTWRL on the primary.
- Back up from a replica. If you require FTWRL for consistency, run the backup against a replica rather than the primary. Monitor replication lag; if the replica is lagging, the backup may still stall replica writes, but the primary remains unaffected.
- Monitor open transaction age. Alert on any transaction in
INNODB_TRXolder than your backup window. - Set query timeouts. Use
max_execution_timeto cap ad-hocSELECTduration and prevent runaway queries from blocking backups. - Document backup requirements. Ensure every backup job includes a comment explaining why it uses or avoids table locks.
How Netdata helps
Netdata correlates the signals that reveal an FTWRL stall:
Questionsrate: A sudden drop to near zero whileThreads_connectedremains high is the signature of a global lock freeze.Threads_running: Spikes during FTWRL show queries piling up behind the lock, even if CPU and disk are idle.Threads_connected: Approachingmax_connectionsduring a stall provides early warning before connection refusal.- Metadata lock monitoring: Surfaces metadata lock buildup that precedes full cascades.
- Anomaly detection: Flags unusual throughput drops during scheduled backup windows.
Related guides
- How MySQL actually works in production: a mental model for operators
- MySQL Aborted_connects and Aborted_clients climbing: diagnosis
- MySQL connection exhaustion: detection, diagnosis, and prevention
- MySQL innodb_deadlock_detect=OFF: when deadlock detection becomes the bottleneck
- MySQL ERROR 1213: Deadlock found when trying to get lock; try restarting transaction
- MySQL gap locks and next-key locks: surprising deadlocks under REPEATABLE READ
- MySQL ERROR 1205: Lock wait timeout exceeded; try restarting transaction
- MySQL long-running transactions: detecting and killing the silent blocker
- MySQL: got a packet bigger than ‘max_allowed_packet’ bytes - causes and fixes
- MySQL metadata lock cascade: how one ALTER TABLE freezes a whole table
- MySQL monitoring checklist: the signals every production instance needs
- MySQL monitoring maturity model: from survival to expert







