MySQL binary logs filling the disk: expiry, lagging replicas, and purge

You get a disk-full alert on the MySQL primary. df -h shows the data partition at 95%, and du points to /var/lib/mysql/binlog.* consuming hundreds of gigabytes. Writes are about to fail.

Binary logs are append-only. MySQL rotates to a new file at max_binlog_size (default 1 GB), but rotation does not delete old files. Deletion happens only through automatic expiry or PURGE BINARY LOGS, and neither removes a binlog file that a connected replica has not yet consumed. In MySQL 5.7, logs never expire by default (expire_logs_days = 0). In 8.0, the default is 30 days (binlog_expire_logs_seconds = 2592000). A replica that lags past the expiry window therefore causes unbounded growth.

If you are responding to an incident, confirm whether binlogs can be safely purged without breaking replication, then free space. If you are reading to prevent the next incident, configure explicit expiry and monitor the signals that predict exhaustion.

What this means

The binary log records every data-modifying event on the primary. Replicas pull events via the I/O thread and apply them. MySQL’s purge logic is conservative: it will not remove a binlog file still needed by any connected replica. A single lagging replica can hold the entire retention window hostage. The primary generates new binlog files proportional to write workload. Without purge, disk usage grows linearly.

When the binlog partition fills, the primary cannot write new events and commits stall. On replicas, a full relay log partition stops the I/O thread, so replication lag grows. If the primary purges events before a replica has fetched them, that replica cannot resume replication and must be rebuilt.

flowchart TD
    A[Disk alert on primary] --> B[SHOW BINARY LOGS]
    B --> C{Binlogs dominant?}
    C -->|No| D[Check ibdata1, tmpdir, undo]
    C -->|Yes| E[Check expiry config]
    E --> F{Expiry set?}
    F -->|No| G[Configure expiry]
    F -->|Yes| H[Check replica status]
    H --> I{Replica blocking?}
    I -->|Yes| J[Fix or remove replica]
    I -->|No| K[Check for bulk loads]

Common causes

Cause	What it looks like	First thing to check
No expiry configured	MySQL 5.7 with `expire_logs_days = 0`; hundreds of binlog files	`SHOW GLOBAL VARIABLES LIKE 'expire_logs%'`
Lagging replica	`Seconds_Behind_Source` growing or NULL; `Relay_Log_Space` accumulating	`SHOW REPLICA STATUS\G` on every replica
Large transactions spilling to disk	Sudden binlog growth after a bulk load; `Binlog_cache_disk_use` increasing	`SHOW GLOBAL STATUS LIKE 'Binlog_cache_%'`
Delayed replica	`SQL_Delay` > 0; replica intentionally lags by hours or days	`SHOW REPLICA STATUS\G` and check `SQL_Delay`

Quick checks

# Check filesystem utilization for the MySQL data directory
df -h /var/lib/mysql

-- List binary log files and their sizes (sum the File_size column for the total)
SHOW BINARY LOGS;

-- Check automatic expiry configuration for your version
SHOW GLOBAL VARIABLES LIKE 'binlog_expire_logs_seconds';
SHOW GLOBAL VARIABLES LIKE 'expire_logs_days';

-- Check replication thread state and lag (8.0.22+)
SHOW REPLICA STATUS\G
-- For MySQL 5.7: SHOW SLAVE STATUS\G

-- Check if large transactions are spilling to disk
SHOW GLOBAL STATUS LIKE 'Binlog_cache_%';

# Check current binlog file sizes directly on disk
du -sh /var/lib/mysql/binlog.*
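
SHOW BINARY LOGS returns one row per file, so the total has to be computed separately. The following is a minimal Python sketch, not part of the original checks, that counts files and sums the File_size column from tab-separated client output. The function name summarize_binlogs is hypothetical, and the sketch assumes the rows come from something like mysql --batch --skip-column-names -e "SHOW BINARY LOGS".

```python
from typing import Tuple


def summarize_binlogs(batch_output: str) -> Tuple[int, int]:
    """Return (file_count, total_bytes) from tab-separated SHOW BINARY LOGS rows.

    Each row is expected to look like "binlog.000001<TAB>1073742000[<TAB>No]".
    The third column (Encrypted) exists on 8.0 and is ignored here.
    """
    count = 0
    total = 0
    for line in batch_output.splitlines():
        fields = line.split("\t")
        # Skip blank lines and a header row if column names were not suppressed.
        if len(fields) < 2 or not fields[1].strip().isdigit():
            continue
        count += 1
        total += int(fields[1])
    return count, total


sample = "binlog.000001\t1073742000\tNo\nbinlog.000002\t524288\tNo\n"
files, size_bytes = summarize_binlogs(sample)
print(f"{files} files, {size_bytes / 1024**3:.2f} GiB")
```

Compare the total against the used space reported by df to decide whether binlogs are really the dominant consumer.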

How to diagnose it

Confirm binlogs are the primary disk consumer. Run SHOW MASTER STATUS; to identify the current file, then SHOW BINARY LOGS; to list all files and sum the File_size column. If binlogs are not the majority of consumed space, investigate ibdata1, undo tablespaces, or temp table spills instead.
Verify expiry configuration. On MySQL 8.0, binlog_expire_logs_seconds should match your recovery window. On 5.7, expire_logs_days defaults to 0, which means logs accumulate forever.
Inspect every replica. On each replica, run SHOW REPLICA STATUS\G (or SHOW SLAVE STATUS\G on 5.7). Check Replica_IO_Running, Replica_SQL_Running, Seconds_Behind_Source, and Source_Log_File (on 5.7: Slave_IO_Running, Slave_SQL_Running, Seconds_Behind_Master, and Master_Log_File). If either thread is not Yes, or lag is growing, that replica is blocking purge.
Determine if growth is organic or from a recent burst. Compare the current SHOW BINARY LOGS file count to yesterday’s baseline. A sudden spike after a bulk LOAD DATA or large UPDATE indicates a one-time event. Steady linear growth indicates missing expiry or persistent lag.
Calculate runway before disk full. Measure daily growth rate of the binlog directory and divide remaining free space by that rate. If the volume also holds data files, redo logs, or temp tables, leave at least 30% headroom.
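
The runway arithmetic in the last step can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the function names daily_growth and runway_days are hypothetical, the size samples are assumed to come from your own periodic measurement of the binlog directory, and the headroom parameter simply reserves a fraction of the volume, such as the 30% mentioned above.

```python
from typing import List, Tuple

SECONDS_PER_DAY = 86_400


def daily_growth(samples: List[Tuple[int, int]]) -> float:
    """Bytes per day between the first and last (unix_time, dir_bytes) sample."""
    (t0, b0), (t1, b1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must span a positive time range, oldest first")
    return (b1 - b0) / ((t1 - t0) / SECONDS_PER_DAY)


def runway_days(free_bytes: float, growth_per_day: float,
                capacity_bytes: float = 0, headroom: float = 0.0) -> float:
    """Days until free space (minus reserved headroom) is consumed."""
    usable = free_bytes - headroom * capacity_bytes
    if usable <= 0:
        return 0.0  # already inside the reserved headroom
    if growth_per_day <= 0:
        return float("inf")  # not growing, so no exhaustion date
    return usable / growth_per_day


GB = 10**9
growth = daily_growth([(0, 100 * GB), (2 * SECONDS_PER_DAY, 140 * GB)])
print(runway_days(50 * GB, growth))  # 50 GB free at 20 GB/day
```

Two samples are enough for a rough estimate. A burst from a bulk load will skew the rate, so it helps to compare the result against a longer baseline.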

Metrics and signals to monitor

Signal	Why it matters	Warning sign
Binary log partition utilization	Direct measure of space exhaustion	> 70% of partition capacity
`Seconds_Behind_Source`	Lagging replicas block automatic purge because the primary must retain logs until the replica reads them	> 300 seconds and growing
Oldest binlog age vs expiry window	Confirms automatic purge is actually working and not blocked by a replica	Oldest file age > expiry setting + 1 day
`Binlog_cache_disk_use` rate	Large transactions accelerate space consumption and often replicate slower	Sharp increase after bulk operations
`Relay_Log_Space` on replicas	Predicts cascading replica disk failure; a full replica stops replication and falls further behind	Growing continuously with lag
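
The "oldest binlog age vs expiry window" signal can be approximated from file modification times. The sketch below is a hedged illustration in Python: the function names are hypothetical, the glob pattern assumes the /var/lib/mysql/binlog.* layout used in this guide, and file mtime is used as an approximation of when a binlog was last written.

```python
import glob
import os
import time
from typing import Optional


def oldest_binlog_age_seconds(pattern: str, now: Optional[float] = None) -> Optional[float]:
    """Age in seconds of the oldest file matching pattern, or None if no match."""
    # The index file matches binlog.* but is not a log file, so exclude it.
    paths = [p for p in glob.glob(pattern) if not p.endswith(".index")]
    if not paths:
        return None
    now = time.time() if now is None else now
    return now - min(os.path.getmtime(p) for p in paths)


def purge_looks_blocked(age_seconds: float, expiry_seconds: int,
                        grace_seconds: int = 86_400) -> bool:
    """True when the oldest file is older than the expiry window plus one day."""
    # expiry_seconds == 0 means no automatic expiry, so nothing is "blocked".
    return expiry_seconds > 0 and age_seconds > expiry_seconds + grace_seconds


age = oldest_binlog_age_seconds("/var/lib/mysql/binlog.*")
if age is not None and purge_looks_blocked(age, expiry_seconds=604_800):
    print("oldest binlog is past the expiry window: check replicas")
```

The one-day grace period mirrors the warning threshold in the table. Expiry is evaluated at rotation and startup, so a small overshoot is normal.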

Fixes

Configure automatic expiry

For MySQL 8.0, set binlog_expire_logs_seconds to your recovery window. For example, seven days is 604800. For MySQL 5.7, set expire_logs_days explicitly; the default of 0 means logs never expire. Both are dynamic variables: apply with SET GLOBAL and persist in the configuration file. Tradeoff: a shorter window reduces disk usage but narrows the recovery window after a backup.
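
As a worked example of the conversion, the following Python sketch turns a retention period in days into the statement for each version. The helper name expiry_statement is hypothetical, and the sketch only builds the SQL text; it does not connect to a server or update the configuration file.

```python
SECONDS_PER_DAY = 86_400


def expiry_statement(retention_days: int, version: str) -> str:
    """Build the SET GLOBAL statement for a retention window given in days."""
    if retention_days <= 0:
        raise ValueError("retention_days must be positive; 0 disables expiry")
    if version == "8.0":
        # 8.0 expresses the window in seconds.
        seconds = retention_days * SECONDS_PER_DAY
        return f"SET GLOBAL binlog_expire_logs_seconds = {seconds};"
    if version == "5.7":
        # 5.7 expresses the window in whole days.
        return f"SET GLOBAL expire_logs_days = {retention_days};"
    raise ValueError(f"unsupported version: {version}")


print(expiry_statement(7, "8.0"))
print(expiry_statement(7, "5.7"))
```

SET GLOBAL changes only the running server, so the same value still has to be written to the configuration file to survive a restart.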

Safely purge binary logs manually

If disk space is critical and automatic expiry is not removing files fast enough, use PURGE BINARY LOGS BEFORE 'YYYY-MM-DD hh:mm:ss';. To avoid timestamp ambiguity, use PURGE BINARY LOGS TO 'binlog.000999'; instead, which deletes every file before the named one and keeps the named file. Before running either, verify that every replica has consumed events past the cutoff. On each replica, run SHOW REPLICA STATUS\G (or SHOW SLAVE STATUS\G on 5.7) and confirm that Source_Log_File (Master_Log_File on 5.7) is at or past the oldest file you intend to keep. If a replica is stopped, do not purge past its last read position unless you plan to rebuild that replica.

Warning: do not rm binlog files from the shell. Removing files directly orphans entries in the index file and corrupts MySQL’s log state.
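
Choosing a purge target from the replica positions can be sketched as below. This is an illustrative Python sketch, not a definitive procedure: the function names are hypothetical, the inputs are assumed to be collected by hand from SHOW BINARY LOGS on the primary and from the source log file reported by each replica, and file names are assumed to end in a numeric suffix such as binlog.000012.

```python
from typing import List, Optional


def binlog_number(name: str) -> int:
    """Numeric suffix of a binlog file name, e.g. 'binlog.000012' -> 12."""
    return int(name.rsplit(".", 1)[1])


def safe_purge_target(primary_files: List[str],
                      replica_files: List[str]) -> Optional[str]:
    """File to name in PURGE BINARY LOGS TO, or None if nothing can be purged.

    replica_files holds the source log file each replica is currently reading.
    PURGE BINARY LOGS TO keeps the named file and deletes only earlier ones,
    so the oldest file any replica still reads is the furthest safe target.
    """
    if not primary_files or not replica_files:
        return None  # no replica positions known: decide manually
    oldest_needed = min(replica_files, key=binlog_number)
    if oldest_needed not in primary_files:
        raise ValueError(f"{oldest_needed} is not in the primary's binlog list")
    oldest_present = min(primary_files, key=binlog_number)
    if binlog_number(oldest_needed) <= binlog_number(oldest_present):
        return None  # a replica still needs the oldest file
    return oldest_needed


primary = ["binlog.000010", "binlog.000011", "binlog.000012", "binlog.000013"]
target = safe_purge_target(primary, ["binlog.000013", "binlog.000012"])
if target:
    print(f"PURGE BINARY LOGS TO '{target}';")
```

The sketch deliberately returns None when no replica positions are supplied, because an empty list more often means a replica was not checked than that none exist.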

Address replication lag

If a replica is lagging because of apply bottlenecks, the primary retains binlogs until the replica catches up. Check Seconds_Behind_Source, Relay_Log_Space, and whether the replica’s SQL thread is applying events slower than the I/O thread fetches them. For replicas configured with MASTER_DELAY (SOURCE_DELAY from 8.0.23), the delay window directly extends binlog retention on the primary. Size the binlog partition to tolerate the delay window plus normal growth, or reduce the delay.
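
The replica checks described in this guide can be combined into one pass over all replicas. The following Python sketch is a hedged illustration: the function name blocking_replicas and the input shape (a dict of SHOW REPLICA STATUS fields per replica, using the 8.0.22+ field names) are assumptions, and the 300-second default comes from the warning threshold in the signals table.

```python
from typing import Dict, List, Optional


def blocking_replicas(statuses: Dict[str, Dict[str, Optional[str]]],
                      max_lag_seconds: int = 300) -> Dict[str, List[str]]:
    """Map replica name -> reasons it may be holding binlogs on the primary."""
    flagged: Dict[str, List[str]] = {}
    for name, status in statuses.items():
        reasons = []
        if status.get("Replica_IO_Running") != "Yes":
            reasons.append("I/O thread not running")
        if status.get("Replica_SQL_Running") != "Yes":
            reasons.append("SQL thread not running")
        lag = status.get("Seconds_Behind_Source")
        if lag is None:
            # NULL lag means the replica cannot report a position in time.
            reasons.append("lag is NULL")
        elif int(lag) > max_lag_seconds:
            reasons.append(f"lag {int(lag)}s exceeds {max_lag_seconds}s")
        if reasons:
            flagged[name] = reasons
    return flagged


replicas = {
    "replica-a": {"Replica_IO_Running": "Yes", "Replica_SQL_Running": "Yes",
                  "Seconds_Behind_Source": "12"},
    "replica-b": {"Replica_IO_Running": "Yes", "Replica_SQL_Running": "No",
                  "Seconds_Behind_Source": None},
}
for name, reasons in blocking_replicas(replicas).items():
    print(name, "->", ", ".join(reasons))
```

A single snapshot cannot show whether lag is growing, so run the check repeatedly and compare results before acting on a replica that is only slightly over the threshold.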

Reduce transaction size

If Binlog_cache_disk_use is increasing, transactions are spilling from memory to disk before being written to the binlog. Break bulk operations into smaller commits. This reduces the per-transaction binlog footprint and often improves replication apply performance on the replica side.
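
Breaking a bulk operation into smaller commits can be sketched as follows. This is a minimal Python illustration under stated assumptions: the table name audit_log and its integer id column are hypothetical, the helper names are invented for this example, and each generated statement is assumed to run as its own transaction (for example with autocommit enabled).

```python
from typing import Iterator, List, Tuple


def id_batches(min_id: int, max_id: int, batch_size: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (low, high) id ranges covering min_id..max_id."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    low = min_id
    while low <= max_id:
        high = min(low + batch_size - 1, max_id)
        yield low, high
        low = high + 1


def batched_deletes(table: str, min_id: int, max_id: int,
                    batch_size: int) -> List[str]:
    """One DELETE per id range, so each commit writes a small binlog event group."""
    return [
        f"DELETE FROM {table} WHERE id BETWEEN {low} AND {high};"
        for low, high in id_batches(min_id, max_id, batch_size)
    ]


for statement in batched_deletes("audit_log", 1, 25, 10):
    print(statement)
```

Range batching on a primary key keeps each transaction bounded even when ids are sparse, at the cost of some batches touching fewer rows than others.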

Prevention

Set explicit expiry on every primary. Do not rely on defaults, especially on MySQL 5.7.
Monitor binlog partition utilization and alert at 70%, not 90%.
Monitor replica lag with a heartbeat table or GTID set comparison. Seconds_Behind_Source is unreliable for critical decisions.
Size the binlog partition with at least 30% free headroom if replication lag is common.
Track binlog growth rate daily. Bulk loads can spike growth by an order of magnitude.

How Netdata helps

Disk utilization alerts per mount point catch binlog partition growth before it becomes critical.
MySQL collector exposes Binlog_cache_disk_use and Binlog_cache_use, letting you correlate sudden binlog spikes with large transactions.
Replication lag monitoring per replica identifies which downstream host is blocking purge on the primary.
Long-term retention of binlog directory growth rate makes runway estimation automatic.
