MySQL undo tablespace growing: ibdata1 bloat and undo truncation

Disk space alert: /var/lib/mysql/ibdata1 is 40 GB larger than last week, or the undo_*.ibu files in MySQL 8.0 are consuming terabytes. Table sizes from information_schema do not justify the growth. This is almost always undo log accumulation from a blocked purge thread.

Undo logs let InnoDB reconstruct older row versions for MVCC and transaction rollback. The purge thread deletes them when no active transaction needs them. If one long-running transaction holds a read view, purge stops. History list length then grows linearly with write rate, and the undo tablespace expands to hold the backlog. In MySQL 5.7, that space lands in the system tablespace ibdata1, which never shrinks. In MySQL 8.0, separate undo tablespaces can truncate automatically, but only after the history list drains and the purge thread frees rollback segments. Until then, the disk fills.

What this means

InnoDB’s MVCC keeps old row versions in undo logs. When a transaction modifies a row, the pre-image goes to undo. A running transaction may need an older version for a consistent read, so the purge thread cannot delete undo records newer than the oldest read view.

A long-running transaction, idle or active, pins every subsequent undo record created by the entire workload. The history list length (trx_rseg_history_len) becomes MVCC debt. Queries pay interest: consistent reads traverse longer version chains, and undo tablespace files expand to store retained records.

In MySQL 5.7, undo logs reside in the shared system tablespace ibdata1 by default. That file grows monotonically; deleting data or rolling back transactions does not shrink it. Reclaiming space requires a full logical dump, a clean data directory, and reload.

MySQL 8.0 stores undo in separate tablespace files (undo_001, undo_002, etc.). When the history list drains, automatic truncation reclaims space asynchronously via the purge thread. This is not instantaneous relief.

flowchart TD
    A[Disk alert: ibdata1 or undo files growing] --> B{History list > 100k?}
    B -->|Yes| C[Find oldest transaction in INNODB_TRX]
    B -->|No| D[Check data growth and temp files]
    C --> E{trx_query IS NULL?}
    E -->|Yes| F[Idle connection or leaked transaction]
    E -->|No| G[Long-running query or backup]
    F --> H[Kill thread_id or fix connection pool]
    G --> I[Kill query or let it finish]
    H --> J[History list drains, purge reclaims undo]
    I --> J
    J --> K{MySQL 5.7?}
    K -->|Yes| L[ibdata1 remains bloated, plan logical dump]
    K -->|No| M[Automatic truncation shrinks undo tablespaces async]

Common causes

CauseWhat it looks likeFirst thing to check
Idle transactionINNODB_TRX shows an old trx_started with trx_query IS NULL; often a developer session or ORM leakOldest transaction age and processlist host/user
Long-running reporting query or backupmysqldump --single-transaction or ETL job in REPEATABLE READ holding a read view for hourstrx_query content and connection source
Application connection pool leakFramework begins transactions that are never committed or rolled back before the connection returnsINNODB_TRX count growing steadily with application hostnames
MySQL 5.7 default undo placementibdata1 grows while data files and .ibd sizes stay flat; no undo_* files existSHOW VARIABLES LIKE 'innodb_undo_tablespaces' and version

Quick checks

Run these read-only commands to confirm undo growth and find the blocker.

# History list length (MySQL 8.0)
SELECT COUNT FROM INFORMATION_SCHEMA.INNODB_METRICS WHERE NAME = 'trx_rseg_history_len';
# All versions: parse from SHOW ENGINE INNODB STATUS
SHOW ENGINE INNODB STATUS\G
# Look for: History list length NNNN
# Oldest active transactions
SELECT trx_mysql_thread_id, trx_started,
       TIMESTAMPDIFF(SECOND, trx_started, NOW()) AS age_seconds,
       trx_rows_modified, trx_query
FROM information_schema.INNODB_TRX
ORDER BY trx_started ASC;
# Undo tablespace size on disk
du -sh /var/lib/mysql/ibdata1 /var/lib/mysql/undo_* 2>/dev/null
# Correlate with connection details
SHOW PROCESSLIST;

How to diagnose it

  1. Confirm the growth is undo, not data. Compare .ibd file sizes to the data directory total. If ibdata1 or undo_* grows disproportionately to table sizes, the cause is undo or internal bookkeeping, not user data.
  2. Check the history list length. In MySQL 8.0, query INNODB_METRICS for trx_rseg_history_len. In any version, parse SHOW ENGINE INNODB STATUS for the “History list length” line. A value above 100,000 is concerning; above 1,000,000 indicates a severe blockage.
  3. Identify the blocking transaction. Query information_schema.INNODB_TRX ordered by trx_started. The oldest row holds the read view blocking purge. Note trx_mysql_thread_id and whether trx_query is NULL.
  4. Map the transaction to a connection. Use SHOW PROCESSLIST or performance_schema.threads to find the user, host, and application name for that thread ID. An idle connection with Command = 'Sleep' and a very high Time value confirms a leaked or forgotten transaction.
  5. Determine the MySQL version and undo configuration. Run SELECT VERSION();. If you are on 5.7 and innodb_undo_tablespaces is not configured, the growth is in ibdata1 and will not shrink without a full logical rebuild.
  6. Assess secondary impact. Check if read query latency is rising and Threads_running is increasing. This confirms the history list is degrading performance, not just consuming disk.

Metrics and signals to monitor

SignalWhy it mattersWarning sign
trx_rseg_history_len (history list length)Direct measure of MVCC debt> 100,000 sustained, or any upward trend over hours
Undo tablespace size on disk (ibdata1 or undo_*)Disk space consumption; 5.7 ibdata1 never shrinksGrowing faster than data ingestion, or > 20% of partition
Oldest open transaction age from INNODB_TRXTransaction pinning undo records> 60 seconds in OLTP; > 5 minutes in any workload
Read query latency / Threads_runningSecondary impact from longer version chain traversalLatency rising without query or deployment changes
Innodb_buffer_pool_reads rateMay increase as queries slowCorrelated spike with history list growth

Fixes

Kill the blocking transaction

If the oldest transaction is idle (trx_query IS NULL) or is a non-essential long query, terminate it.

KILL <trx_mysql_thread_id>;

The history list stops growing immediately. The purge thread begins catching up. If the transaction modified many rows, rollback generates I/O and may briefly increase load.

MySQL 5.7: reclaiming ibdata1 space

There is no online shrink operation for ibdata1. Truncation of the shared system tablespace is not supported.

Your options are:

  • Logical dump and reload. Use mysqldump or a logical backup tool to export all data, stop MySQL, remove the data directory, reinitialize, and reload. This is the only way to reduce ibdata1 size. Tradeoff: significant downtime and operator time. This is destructive; verify backups before dropping the data directory.
  • Configure separate undo tablespaces. In MySQL 5.7, innodb_undo_tablespaces is only effective when initializing a new data directory. It prevents future ibdata1 bloat but does not shrink existing undo already inside ibdata1. The separate undo files in 5.7 also grow monotonically.

MySQL 8.0: waiting for automatic truncation

In MySQL 8.0, undo tablespaces are separate and automatic truncation is enabled by default (innodb_undo_log_truncate = ON). Once the history list drains and the purge thread frees rollback segments in a tablespace that exceeds innodb_max_undo_log_size, deactivation and truncation proceed asynchronously.

You cannot force immediate truncation. If disk space is critical, your fastest relief is still killing the offending transaction so purge can run. After the history list returns to normal levels, monitor the undo file sizes on disk. Truncation targets a small initial size, not zero, and may take minutes to hours depending on purge progress.

Application and connection pool fixes

Prevent recurrence by fixing the source:

  • Ensure your ORM or connection pool issues COMMIT or ROLLBACK before returning connections to the pool.
  • Set wait_timeout and interactive_timeout low enough to catch leaked idle connections, but not so low that legitimate long queries are killed.
  • Schedule mysqldump --single-transaction backups during low-traffic windows and monitor their duration.
  • Add application-level query timeouts so reporting queries do not hold read views for hours.

Prevention

  • Monitor history list length continuously. Any sustained growth is an early warning.
  • Alert on open transaction age from INNODB_TRX. In OLTP, anything over 60 seconds warrants investigation.
  • Monitor data directory growth rate independently of table size metrics.
  • In MySQL 5.7, treat ibdata1 growth as a one-way signal. Plan capacity assuming it will not shrink.
  • In MySQL 8.0, do not assume automatic truncation will save you during an active incident. It helps only after purge catches up.

How Netdata helps

  • Correlate disk usage on the MySQL data directory with history list length and transaction age to distinguish undo bloat from data growth.
  • Alert on history list spikes and idle open transactions before disk space becomes critical.
  • Track Innodb_buffer_pool_reads and query latency alongside undo metrics to surface version chain traversal cost.
  • Visualize per-second growth rates for ibdata1 and undo tablespaces to identify bloat onset and correlate with deployment or backup schedules.
  • How MySQL actually works in production: a mental model for operators: /guides/mysql/how-mysql-works-in-production/
  • MySQL Aborted_connects and Aborted_clients climbing: diagnosis: /guides/mysql/mysql-aborted-connections/
  • MySQL adaptive hash index latch contention: high CPU, low throughput: /guides/mysql/mysql-adaptive-hash-index-latch-contention/
  • MySQL InnoDB buffer pool hit ratio collapse: the cliff edge: /guides/mysql/mysql-buffer-pool-hit-ratio-collapse/
  • MySQL slow after restart: buffer pool warm-up and the cold cache: /guides/mysql/mysql-buffer-pool-not-warming-up/
  • MySQL innodb_buffer_pool_size tuning: 60-80% of RAM and when that breaks: /guides/mysql/mysql-buffer-pool-sizing/
  • MySQL Innodb_buffer_pool_wait_free > 0: buffer pool memory pressure: /guides/mysql/mysql-buffer-pool-wait-free/
  • MySQL InnoDB checkpoint age: the redo log capacity signal nobody watches: /guides/mysql/mysql-checkpoint-age-monitoring/
  • MySQL connection exhaustion: detection, diagnosis, and prevention: /guides/mysql/mysql-connection-exhaustion/
  • MySQL innodb_deadlock_detect=OFF: when deadlock detection becomes the bottleneck: /guides/mysql/mysql-deadlock-detect-off-high-concurrency/
  • MySQL ERROR 1213: Deadlock found when trying to get lock; try restarting transaction: /guides/mysql/mysql-deadlock-found/
  • MySQL FLUSH TABLES WITH READ LOCK stall: backups that freeze the server: /guides/mysql/mysql-flush-tables-with-read-lock-stall/