MySQL OOM-killed: buffer pool, per-connection buffers, and the kernel killer

MySQL disappears. Application logs fill with connection timeouts. Uptime resets to near zero. The kernel log shows an oom-kill entry that names mysqld as the task and reports its anonymous RSS. Orchestrators mark the pod OOMKilled (exit code 137). To the application, this is a crash. To the kernel, mysqld was the largest memory consumer when the system ran out.

OOM kills are gradual, then catastrophic. Memory pressure builds as connections open, temp tables materialize, and dirty pages accumulate. The buffer pool is the obvious consumer, but the killer often enters through the back door: a connection burst multiplies per-thread buffers, a container limit sits too close to the buffer pool size, or Transparent Huge Pages make memory harder to reclaim. Preventing recurrence means accounting for every allocator, not just the buffer pool.

What this means

MySQL’s memory footprint is not just the InnoDB buffer pool. The total RSS consumed by mysqld is the sum of several distinct allocators that operators often size independently:

  • Buffer pool: Typically configured to 60-80% of available RAM on a dedicated server. Every read and write touches this pool.
  • Per-connection buffers: Each client thread allocates its own thread_stack, sort_buffer_size, join_buffer_size, and net buffers. At max_connections, this is multiplied across every slot. A server with max_connections=500 can allocate hundreds of megabytes in worst-case per-thread overhead alone.
  • Temporary tables: The TempTable storage engine (default in MySQL 8.0+) holds implicit temp tables in RAM up to temptable_max_ram before spilling to disk.
  • Table cache and internals: The table open cache, Adaptive Hash Index, Performance Schema, binary log cache, and redo log buffer all consume additional memory.

In containers and Kubernetes, the cgroup memory limit is enforced separately from the system-wide OOM killer and usually trips first. If a container exceeds its limit, the kernel picks a victim inside that cgroup, favoring the process that uses the most memory. Because mysqld is almost always the largest process, it usually dies first even when the node has free memory. On bare metal or VMs, the system OOM killer fires when the entire host exhausts memory.
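As a rough illustration of the cgroup versus system distinction, the sketch below classifies a single kernel log line by the two phrases used in the quick checks later in this guide. The function name and the sample lines are made up for the example; real kernel messages carry more fields and vary between kernel versions.

```python
from typing import Optional

def classify_oom_line(line: str) -> Optional[str]:
    """Return 'cgroup', 'system', or None for one kernel log line."""
    lowered = line.lower()
    # Test the cgroup phrase first: it contains "out of memory" as a substring.
    if "memory cgroup out of memory" in lowered:
        return "cgroup"
    if "out of memory" in lowered:
        return "system"
    return None

# Illustrative lines only, not captured from a real host.
sample_lines = [
    "Memory cgroup out of memory: Killed process 4242 (mysqld)",
    "Out of memory: Killed process 4242 (mysqld)",
    "EXT4-fs (sda1): mounted filesystem",
]
kinds = [classify_oom_line(line) for line in sample_lines]
print(kinds)
```

Checking the longer phrase first matters, because a plain "out of memory" match would otherwise label every cgroup kill as a system kill.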

flowchart TD
    RAM[Available RAM or cgroup limit] --> BP[Buffer pool]
    RAM --> CONN[Per-connection buffers]
    RAM --> TMP[Temp tables]
    RAM --> META[Table cache + AHI + PS]
    BP --> RSS[mysqld RSS]
    CONN --> RSS
    TMP --> RSS
    META --> RSS
    RSS --> LIMIT{Limit reached?}
    LIMIT -->|cgroup| OOM1[cgroup OOM killer]
    LIMIT -->|system| OOM2[system OOM killer]
    OOM1 --> KILL[Kill mysqld]
    OOM2 --> KILL

Common causes

| Cause | What it looks like | First thing to check |
| --- | --- | --- |
| Buffer pool oversized for available RAM | mysqld is the top RSS consumer; OOM occurs under normal load without a traffic spike | innodb_buffer_pool_size vs total RAM or container limit |
| Connection burst with large per-thread buffers | Threads_connected spikes and memory climbs linearly; OOM follows a traffic surge | max_connections and the sum of sort_buffer_size, join_buffer_size, and thread_stack |
| Temp table memory explosion | Queries with heavy GROUP BY or ORDER BY create implicit temp tables; memory jumps during batch or analytical queries | temptable_max_ram (MySQL 8.0+) or tmp_table_size and max_heap_table_size |
| Transparent Huge Pages enabled | Gradual memory growth over hours; OOM despite no configuration change | /sys/kernel/mm/transparent_hugepage/enabled |
| NUMA imbalance on multi-socket servers | Memory exhausted on one NUMA node while others have free capacity; OOM despite aggregate free RAM | numactl --hardware or whether innodb_numa_interleave is enabled |
| Container memory limit too tight | Kubernetes marks the pod OOMKilled (exit code 137) while the node has plenty of free memory | cgroup memory limit vs calculated MySQL maximum |

Quick checks

Run these read-only checks to confirm an OOM kill and size the risk.

# Confirm OOM kill in kernel log
sudo dmesg | grep -i "oom-kill\|Out of memory"

# Distinguish cgroup OOM from system OOM (also try journalctl -k if syslog is unavailable)
sudo grep -i "Memory cgroup out of memory" /var/log/syslog

-- Verify unplanned restart (seconds since start)
SHOW GLOBAL STATUS LIKE 'Uptime';

-- Buffer pool and connection buffer sizing
SHOW GLOBAL VARIABLES LIKE 'innodb_buffer_pool_size';
SHOW GLOBAL VARIABLES LIKE 'max_connections';
SHOW GLOBAL VARIABLES LIKE 'thread_stack';
SHOW GLOBAL VARIABLES LIKE 'sort_buffer_size';
SHOW GLOBAL VARIABLES LIKE 'join_buffer_size';

-- Current connection count and temp table budget
SHOW GLOBAL STATUS LIKE 'Threads_connected';
SHOW GLOBAL VARIABLES LIKE 'temptable_max_ram';

# Check Transparent Huge Pages status
cat /sys/kernel/mm/transparent_hugepage/enabled

# Check container memory limit
# cgroup v1
cat /sys/fs/cgroup/memory/memory.limit_in_bytes
# cgroup v2
cat /sys/fs/cgroup/memory.max
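
The two cgroup files report "no limit" differently, which is easy to misread when scripting this check. The sketch below normalizes both to either a byte count or None. The helper names are hypothetical, and treating any value at or above 2**60 bytes as unlimited is an assumption of this example rather than a kernel-defined rule.

```python
from pathlib import Path
from typing import Optional

# Assumption: cgroup v1 expresses "no limit" as a huge number, so anything
# this large is treated as unlimited.
UNLIMITED_THRESHOLD = 1 << 60

def parse_cgroup_limit(raw: str) -> Optional[int]:
    """Return the limit in bytes, or None when the cgroup is unlimited."""
    value = raw.strip()
    if value == "max":  # cgroup v2 spelling of "no limit"
        return None
    limit = int(value)
    return None if limit >= UNLIMITED_THRESHOLD else limit

def read_cgroup_limit() -> Optional[int]:
    """Try the cgroup v2 path first, then the v1 path used above."""
    for path in ("/sys/fs/cgroup/memory.max",
                 "/sys/fs/cgroup/memory/memory.limit_in_bytes"):
        candidate = Path(path)
        if candidate.exists():
            return parse_cgroup_limit(candidate.read_text())
    return None
```

A return value of None means the container has no cgroup ceiling, so the host's total memory is the number to plan against.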

How to diagnose it

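  1. Confirm the kill was OOM. Check dmesg for an oom-kill: line that names mysqld as the task. In containers, look for Memory cgroup out of memory in syslog or journalctl. Exit code 137 (128 + 9) means the process received SIGKILL, which is consistent with an OOM kill but does not prove one on its own.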
  2. Correlate the restart time. SHOW GLOBAL STATUS LIKE 'Uptime' resets to near zero. Cross-reference with the kernel log timestamp to rule out a human-initiated restart.
  3. Calculate worst-case memory. Add innodb_buffer_pool_size to (max_connections multiplied by the sum of thread_stack, sort_buffer_size, and join_buffer_size). Add temp table budgets and overhead for the table cache. Compare this total to available RAM or the container memory limit. Actual RSS is usually lower, but OOM safety requires worst-case planning.
  4. Identify the trigger. Pull Threads_connected history from monitoring. A sharp spike before OOM indicates a connection storm. If connections were flat, look for a new query pattern creating large implicit temp tables.
  5. Check THP. The bracketed value in /sys/kernel/mm/transparent_hugepage/enabled is the active mode. If it shows [always] or [madvise], allocation stalls and fragmentation may have made memory harder to reclaim.
  6. Check NUMA. On multi-socket servers (often those with more than 128GB of RAM), run numactl --hardware. If memory is heavily allocated on one node and free on another, NUMA imbalance pushed the local node into OOM.
  7. Check container limits. If running in Kubernetes or Docker, compare the pod’s memory limit to the calculated MySQL maximum. cgroup OOM fires before system OOM, so the node may appear healthy while the container is not.
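
The worst-case sum from step 3 can be sketched as a small calculation. Every number below is a hypothetical input chosen for illustration, not a recommendation or a MySQL default; substitute the values returned by the quick checks.

```python
MIB = 1024 * 1024

def worst_case_bytes(buffer_pool, max_connections, thread_stack,
                     sort_buffer, join_buffer, temp_table_budget,
                     other_overhead):
    """Buffer pool + (connections x per-thread buffers) + temp tables + overhead."""
    per_thread = thread_stack + sort_buffer + join_buffer
    return (buffer_pool + max_connections * per_thread
            + temp_table_budget + other_overhead)

# Hypothetical instance: 16 GiB limit, 12 GiB buffer pool, 500 connections.
total = worst_case_bytes(
    buffer_pool=12 * 1024 * MIB,
    max_connections=500,
    thread_stack=1 * MIB,
    sort_buffer=MIB // 4,
    join_buffer=MIB // 4,
    temp_table_budget=1024 * MIB,
    other_overhead=512 * MIB,   # table cache and internals, a guess
)
limit = 16 * 1024 * MIB
headroom = limit - total
print(f"worst case: {total // MIB} MiB, headroom: {headroom // MIB} MiB")
```

With these inputs the worst case is 14574 MiB, leaving 1810 MiB of headroom. A negative headroom value means the configuration can exceed the limit if every connection uses its full buffers at once.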

Metrics and signals to monitor

| Signal | Why it matters | Warning sign |
| --- | --- | --- |
| Uptime | OOM kills appear as unplanned restarts | Drops below 600 seconds outside maintenance windows |
| Threads_connected | Each connection holds per-thread memory | Sustained ratio to max_connections above 0.8 |
| Innodb_buffer_pool_wait_free | Queries stalled waiting for free buffer pool pages | Nonzero sustained rate indicates acute memory pressure |
| Dirty page ratio (Innodb_buffer_pool_pages_dirty / pages_total) | Dirty pages consume buffer pool headroom | Sustained above 75% of total pages |
| Created_tmp_tables rate | Implicit temp tables allocate memory before spilling | Sudden spike correlating with RSS growth |
| Threads_running | Active concurrency drives actual memory usage | Sustained above 4 times CPU cores with climbing RSS |
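
A few of these thresholds can be evaluated from one snapshot of SHOW GLOBAL STATUS. The sketch below is a minimal example under that assumption: the function name and sample values are hypothetical, and a single snapshot cannot tell whether a condition is sustained, so a real alert would evaluate a series of samples.

```python
def warning_signs(status: dict, max_connections: int) -> list:
    """Apply the Uptime, connection-ratio, and dirty-page thresholds once."""
    warnings = []
    if status["Uptime"] < 600:
        warnings.append("recent restart")
    if status["Threads_connected"] / max_connections > 0.8:
        warnings.append("connections near max_connections")
    dirty_ratio = (status["Innodb_buffer_pool_pages_dirty"]
                   / status["Innodb_buffer_pool_pages_total"])
    if dirty_ratio > 0.75:
        warnings.append("high dirty page ratio")
    return warnings

# Hypothetical snapshot taken shortly after a restart during a connection surge.
snapshot = {
    "Uptime": 120,
    "Threads_connected": 450,
    "Innodb_buffer_pool_pages_dirty": 100,
    "Innodb_buffer_pool_pages_total": 1000,
}
found = warning_signs(snapshot, max_connections=500)
print(found)
```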

Fixes

Resize the buffer pool

If innodb_buffer_pool_size is above the 60-80% range of available RAM, or leaves too little room for the other allocators, reduce it. MySQL 5.7.5+ supports dynamic resizing with SET GLOBAL innodb_buffer_pool_size = .... Monitor SHOW STATUS LIKE 'Innodb_buffer_pool_resize_status' during the operation.
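
One detail worth checking when choosing the new value: MySQL keeps the buffer pool at a multiple of innodb_buffer_pool_chunk_size times innodb_buffer_pool_instances and adjusts other values itself. The sketch below rounds a target down to that unit so the server's adjustment cannot push the pool above the budget. The function name and the sample figures are assumptions for illustration; read the real chunk size and instance count from the server.

```python
MIB = 1024 * 1024

def aligned_buffer_pool_size(target_bytes: int, chunk_size: int,
                             instances: int) -> int:
    """Round target_bytes down to a multiple of chunk_size * instances."""
    unit = chunk_size * instances
    # Never go below one unit, the smallest size the pool can take.
    return max(unit, (target_bytes // unit) * unit)

# Hypothetical values: 11000 MiB target, 128 MiB chunks, 8 instances.
new_size = aligned_buffer_pool_size(11000 * MIB, 128 * MIB, 8)
print(f"SET GLOBAL innodb_buffer_pool_size = {new_size};")
```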

Cap connections and per-thread buffers

Lower max_connections to a value your memory budget supports. Reduce sort_buffer_size, join_buffer_size, and thread_stack if they have been raised above defaults. At 500 connections, the worst-case per-thread overhead alone can exceed 640MB.
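
To see how raised per-thread buffers shrink the connection budget, the same accounting can be solved for max_connections. This is a sketch with hypothetical inputs: an 8 GiB limit, a 5 GiB buffer pool, and sort and join buffers raised to 4 MiB each.

```python
MIB = 1024 * 1024

def max_connections_for_budget(memory_limit, buffer_pool, temp_table_budget,
                               other_overhead, per_thread_bytes):
    """Largest connection count whose worst-case buffers fit in what is left."""
    available = memory_limit - buffer_pool - temp_table_budget - other_overhead
    if available <= 0:
        return 0
    return available // per_thread_bytes

# Per-thread sum: 1 MiB thread_stack + 4 MiB sort buffer + 4 MiB join buffer.
per_thread = 1 * MIB + 4 * MIB + 4 * MIB
supported = max_connections_for_budget(
    memory_limit=8 * 1024 * MIB,
    buffer_pool=5 * 1024 * MIB,
    temp_table_budget=1024 * MIB,
    other_overhead=512 * MIB,   # table cache and internals, a guess
    per_thread_bytes=per_thread,
)
print(supported)
```

With these inputs only 170 connections fit, far below a max_connections of 500, which is the kind of gap that turns a connection burst into an OOM kill.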

Limit temp table RAM

In MySQL 8.0+, temptable_max_ram caps the total memory used by the TempTable engine across all queries. Lowering it forces earlier spill to disk. Tradeoff: queries using large implicit temp tables become slower, but the instance survives.

Disable Transparent Huge Pages

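echo never | sudo tee /sys/kernel/mm/transparent_hugepage/enabled

THP can cause allocation stalls in MySQL. Most production deployments disable it. This change is runtime-only. THP is a sysfs setting rather than a sysctl, so persist it with the transparent_hugepage=never kernel boot parameter (for example in the GRUB configuration) or a boot-time service that writes the value.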
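
To confirm the change took effect, read the file back: the kernel lists all modes and puts the active one in square brackets. A minimal sketch of that parse, with a hypothetical function name:

```python
import re

def active_thp_mode(raw: str) -> str:
    """Extract the bracketed (active) mode from the THP 'enabled' file content."""
    match = re.search(r"\[(\w+)\]", raw)
    if match is None:
        raise ValueError(f"unexpected THP status: {raw!r}")
    return match.group(1)

# Content as it looks after THP has been disabled.
mode = active_thp_mode("always madvise [never]\n")
print(mode)
```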

Address NUMA imbalance

On multi-socket servers, set innodb_numa_interleave=ON (requires restart) or launch mysqld with numactl --interleave=all. This prevents one NUMA node from saturating while others are free.
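
Before and after changing the NUMA policy, the per-node free memory reported by numactl --hardware shows whether one node is starved. The sketch below computes the free fraction per node from that output. The sample text is illustrative rather than captured from a real host, and what counts as an imbalance is left to the reader.

```python
import re

def numa_free_fractions(numactl_output: str) -> dict:
    """Map each NUMA node id to free/size, using the 'node N size|free: X MB' lines."""
    sizes, free = {}, {}
    pattern = r"^node (\d+) (size|free): (\d+) MB"
    for node, field, mb in re.findall(pattern, numactl_output, re.MULTILINE):
        target = sizes if field == "size" else free
        target[int(node)] = int(mb)
    return {n: free[n] / sizes[n] for n in sizes if n in free and sizes[n] > 0}

# Illustrative output for a two-socket host.
sample_output = """available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3
node 0 size: 65536 MB
node 0 free: 512 MB
node 1 cpus: 4 5 6 7
node 1 size: 65536 MB
node 1 free: 32768 MB
"""
fractions = numa_free_fractions(sample_output)
print(fractions)
```

In this sample, node 0 has under 1% free while node 1 has half its memory free, which matches the imbalance pattern described in the diagnosis steps.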

Add swap as a pressure valve

Without swap, a single memory spike can trigger an immediate OOM kill. Configuring a swap file provides headroom. Tradeoff: if buffer pool pages swap, latency degrades. Monitor the si (swap-in) and so (swap-out) columns in vmstat.
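
If vmstat output is collected by a script, the si and so columns can be located by header name rather than by position, since column layouts differ between vmstat versions. A sketch under that assumption, with made-up sample output:

```python
def swap_activity(vmstat_output: str) -> list:
    """Return (si, so) pairs for every data row that follows the column header."""
    columns = None
    samples = []
    for line in vmstat_output.splitlines():
        fields = line.split()
        if "si" in fields and "so" in fields:
            columns = fields  # header row naming each column
        elif (columns and len(fields) == len(columns)
              and all(field.isdigit() for field in fields)):
            row = dict(zip(columns, map(int, fields)))
            samples.append((row["si"], row["so"]))
    return samples

# Illustrative output: the second sample shows active swapping.
sample_vmstat = """procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 123456   7890 456789    0    0     5    10  100  200  1  1 98  0  0
 3  1   2048   1024    512   4096  120  340   900  1500  800 1600 20 10 40 30  0
"""
pairs = swap_activity(sample_vmstat)
print(pairs)
```

Sustained nonzero values in either column suggest the swap space is being used as working memory rather than as a safety margin.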

Adjust container limits

In Kubernetes or Docker, the cgroup memory limit must cover the buffer pool plus per-connection overhead plus temp space and OS overhead. If the pod is OOMKilled while the node has free memory, raise the limit or shrink the buffer pool.

Consider vm.overcommit_memory=2

Setting vm.overcommit_memory=2 makes the kernel refuse allocations beyond a fixed commit limit instead of overcommitting. OOM kills caused by overcommitment stop, but allocations that would have succeeded before now fail with ENOMEM, which can cause MySQL to abort. The setting is system-wide. Test thoroughly before applying.
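
In this mode the kernel's commit limit is swap plus a percentage of RAM set by vm.overcommit_ratio, which is not covered elsewhere in this guide and defaults to 50. The sketch below shows why that matters for a large buffer pool. It ignores huge page reservations, and the host sizes are hypothetical.

```python
GIB = 1024 ** 3

def commit_limit_bytes(ram_bytes: int, swap_bytes: int,
                       overcommit_ratio: int = 50) -> int:
    """Approximate commit limit in mode 2: swap + ratio% of RAM (huge pages ignored)."""
    return swap_bytes + ram_bytes * overcommit_ratio // 100

# Hypothetical 64 GiB host without swap, default ratio.
limit = commit_limit_bytes(64 * GIB, 0)
buffer_pool = 45 * GIB   # roughly 70% of RAM
fits = buffer_pool <= limit
print(limit // GIB, fits)
```

Here the limit is 32 GiB, so a 45 GiB buffer pool would not fit unless the ratio is raised or swap is added. This is one reason to test the setting on a staging host first.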

Prevention

  • Size the buffer pool to 60-80% of available RAM only on dedicated servers. If the host runs other services, reduce accordingly and model the worst-case total.
  • Set max_connections based on memory, not just connection demand. Use the formula: buffer pool plus (max connections multiplied by per-thread buffer sum) plus temp table budget.
  • Disable Transparent Huge Pages before production traffic starts.
  • Plan NUMA policy for multi-socket hardware.
  • In containers, set memory limits with headroom for connection spikes and temporary allocations. Do not set the limit equal to the buffer pool size.
  • Monitor Uptime and host memory utilization together. Do not rely on MySQL internal metrics alone to predict OOM.

How Netdata helps

  • Correlates mysql_Uptime drops with system kmsg OOM killer events to distinguish an OOM kill from a planned restart.
  • Tracks mysql_Innodb_buffer_pool_wait_free to detect buffer pool memory pressure before the kernel intervenes.
  • Monitors mysql_Threads_connected against max_connections to identify connection-driven memory spikes.
  • Alerts on host memory utilization with context on whether mysqld is the top consumer.
  • Visualizes mysql_Innodb_buffer_pool_pages_dirty alongside disk write metrics to surface checkpoint pressure that amplifies memory contention.