MySQL tablespace fragmentation: reclaiming space with OPTIMIZE TABLE

InnoDB tablespaces fragment after bulk deletes, updates, and purge cycles. Logical data volume shrinks, but the on-disk file retains allocated extents because InnoDB returns freed pages to its internal free lists rather than the OS. For file-per-table .ibd tables, that unused space is visible as DATA_FREE in information_schema.TABLES and can be reclaimed with OPTIMIZE TABLE.

Tables inside the shared system tablespace ibdata1 do not shrink at the OS level. Freed space returns to InnoDB’s internal free lists, but ibdata1 never releases extents back to the filesystem. If the goal is reclaiming disk space on a full volume, OPTIMIZE TABLE against tables in ibdata1 is ineffective. The rest of this guide covers identifying reclaimable tables, running the rebuild safely, and verifying the result.

What fragmentation looks like in InnoDB

InnoDB stores rows in 16 KB pages grouped into extents. Random deletes leave gaps inside pages and can leave entire extents unused. DATA_FREE in information_schema.TABLES reports allocated-but-unused bytes for each table. A large DATA_FREE relative to DATA_LENGTH + INDEX_LENGTH is the canonical signal of fragmentation after a bulk purge.

Some free space is normal. InnoDB leaves headroom inside pages to reduce page splits during updates, so a small DATA_FREE is not a problem by itself. Target tables where DATA_FREE is a significant fraction of the total size, typically after large archival deletes or queue purges.

DATA_FREE captures unused extents, not microscopic holes inside partially filled pages. If you need byte-level insight, tools like innodb_ruby can visualize page fill rates, but for operational cleanup DATA_FREE is the right starting signal.

Prerequisites and constraints

RequirementWhy it matters
innodb_file_per_table = ONOnly file-per-table .ibd files shrink on disk. Tables in ibdata1 never reduce the OS file size.
Free temp disk >= current .ibd sizeOPTIMIZE TABLE builds a new copy of the table before dropping the old one.
MySQL 5.6.17+ for online DDLEarlier versions lock the table for the entire rebuild. Modern versions hold an exclusive lock only briefly at the start and end.
Tables with FULLTEXT indexes use offline copyOnline DDL is not supported for full-text index rebuilds; expect a longer exclusive lock.

If innodb_file_per_table is currently OFF, newly created tables will use file-per-table tablespaces after you enable it, but existing tables are not migrated automatically. Tables already inside ibdata1 stay there until you logically dump and reload them.

Identify fragmented tables

Use information_schema.TABLES to find InnoDB tables with significant DATA_FREE. Filter by engine and a minimum free-space threshold to avoid noise from tiny tables.

-- Find InnoDB tables with large DATA_FREE
SELECT
  TABLE_SCHEMA,
  TABLE_NAME,
  ROUND((DATA_LENGTH + INDEX_LENGTH) / 1024 / 1024, 2) AS logical_mb,
  ROUND(DATA_FREE / 1024 / 1024, 2) AS free_mb,
  ROUND(
    DATA_FREE / NULLIF(DATA_LENGTH + INDEX_LENGTH + DATA_FREE, 0) * 100,
    2
  ) AS frag_pct
FROM information_schema.TABLES
WHERE ENGINE = 'InnoDB'
  AND DATA_FREE > 104857600
ORDER BY DATA_FREE DESC;

Cross-check the largest candidates on the filesystem:

# Check physical .ibd size for a candidate table
ls -lh /var/lib/mysql/db_name/table_name.ibd

# Check free space on the data volume
df -h /var/lib/mysql

If a table has no corresponding .ibd file under the database directory, it lives in ibdata1 and cannot be shrunk with OPTIMIZE TABLE.

flowchart TD
    A[High DATA_FREE on InnoDB table] --> B{File-per-table .ibd?}
    B -->|No| C[Space stays inside ibdata1. OPTIMIZE does not shrink OS file.]
    B -->|Yes| D{Enough temp disk?}
    D -->|No| E[Free disk or postpone.]
    D -->|Yes| F[Run OPTIMIZE TABLE or ALTER TABLE ... ENGINE=InnoDB]
    F --> G[Verify DATA_FREE dropped and .ibd shrank]

Reclaim space safely

For InnoDB, OPTIMIZE TABLE maps to ALTER TABLE ... FORCE followed by ANALYZE TABLE. The server prints a message that says the table does not support optimize and that it is doing a recreate plus analyze instead. This is expected.

  1. Confirm the table is file-per-table. Check the database directory for a standalone .ibd file. Even with innodb_file_per_table = ON, a table may reside in ibdata1 if it was created inside the system tablespace explicitly. If there is no .ibd, the table cannot be shrunk with OPTIMIZE TABLE.

  2. Confirm disk headroom. The rebuild creates a temporary copy of the table. You need at least the current .ibd size in free space on the data volume, or on the volume that hosts temporary files during the rebuild.

    # Current table size and volume free space
    du -sh /var/lib/mysql/db_name/table_name.ibd
    df -h /var/lib/mysql
    
  3. Run the rebuild. Use OPTIMIZE TABLE for routine maintenance, or ALTER TABLE ... ENGINE=InnoDB if you want the same operation without the automatic statistics update.

    OPTIMIZE TABLE db_name.table_name;
    

    Expected output for InnoDB:

    | Table              | Op       | Msg_type | Msg_text                                                          |
    |--------------------|----------|----------|-------------------------------------------------------------------|
    | db_name.table_name | optimize | note     | Table does not support optimize, doing recreate + analyze instead |
    | db_name.table_name | optimize | status   | OK                                                                |
    
  4. Monitor the server during the rebuild. Online DDL allows concurrent DML for most tables, but it still generates redo log writes, dirty pages, and I/O. Watch for checkpoint age climbing toward the redo log capacity, metadata lock waits, and disk saturation.

  5. For full-text tables, schedule a maintenance window. OPTIMIZE TABLE on a table with a FULLTEXT index falls back to the table-copy method, which holds an exclusive lock for most of the operation.

  6. If you only need fresh statistics, stop at ANALYZE TABLE. ANALYZE TABLE updates the optimizer’s row counts and index cardinality estimates but does not copy the table or reclaim disk space.

    ANALYZE TABLE db_name.table_name;
    

Verify the result

After the rebuild finishes, confirm both the logical and physical shrink:

-- Check that DATA_FREE dropped
SELECT
  ROUND((DATA_LENGTH + INDEX_LENGTH) / 1024 / 1024, 2) AS logical_mb,
  ROUND(DATA_FREE / 1024 / 1024, 2) AS free_mb
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'db_name' AND TABLE_NAME = 'table_name';
# Confirm the .ibd file shrank
ls -lh /var/lib/mysql/db_name/table_name.ibd

DATA_FREE may not drop to exactly zero because InnoDB keeps normal page headroom, but a large drop alongside a smaller .ibd confirms the reclaim worked.

Common pitfalls

  • Optimizing tables in ibdata1. The operation will rebuild the table and compact rows, but ibdata1 will not return space to the OS. The only remedy for a bloated ibdata1 is logical dump and reload with innodb_file_per_table=ON set before restore.
  • Underestimating temp disk. The new copy exists alongside the old one until the final atomic swap. If the data volume is nearly full, the rebuild can fail or trigger a disk-full incident.
  • Running on full-text tables during peak traffic. The offline copy locks the table exclusively. Treat it like a schema migration.
  • Confusing ANALYZE TABLE with OPTIMIZE TABLE. ANALYZE TABLE only updates statistics. It is useful after bulk loads or when query plans regress, but it does not shrink the physical file.
  • Ignoring replicas. The DDL replicates. Replicas need similar temp disk and will experience similar I/O spikes. Schedule accordingly on replicas serving reads.
  • TRUNCATE TABLE is faster for total cleanup. If you genuinely want to remove all rows and reclaim all space, TRUNCATE TABLE drops and recreates the .ibd file instantly. Do not use it if you need to keep any rows.

Signals to monitor during and after

SignalWhy it mattersWarning sign
DATA_FREE / (DATA_LENGTH + INDEX_LENGTH)Fragmentation ratio> 20-30% after bulk deletes
.ibd file size vs logical sizePhysical bloatFile much larger than DATA_LENGTH + INDEX_LENGTH
Disk free % on data and temp volumesPrevents rebuild failure< 30% before running OPTIMIZE TABLE
Checkpoint age / redo log capacityDDL generates redo; stalls writes if filled> 75% during rebuild
Threads_running and metadata locksConcurrency impactQueuing or “Waiting for table metadata lock”
Innodb_buffer_pool_wait_freeMemory pressure from the rebuildNonzero sustained rate

How Netdata helps

Netdata’s MySQL collector surfaces the signals needed to judge fragmentation and rebuild safety:

  • Correlate table DATA_FREE with DATA_LENGTH to spot tables that grew lopsided after a purge.
  • Watch disk usage on the data and temporary volumes before scheduling OPTIMIZE TABLE.
  • Monitor InnoDB checkpoint age and redo log usage during the rebuild to catch checkpoint pressure before it stalls writes.
  • Track thread concurrency and metadata-lock waits to confirm the table stayed online and DML did not pile up.
  • Watch buffer pool wait events and hit ratio to detect whether the rebuild pushed memory pressure high enough to affect other queries.