PostgreSQL out of memory: OOM killer, shared_buffers, and work_mem
Your PostgreSQL primary restarts without warning, or individual backends vanish from the process list. The kernel log shows Out of Memory: Killed process 12345 (postgres). Existing connections may survive, but new connections fail until the postmaster recovers. This is a memory accounting mismatch between Linux overcommit, PostgreSQL shared and private memory allocation, and how you size shared_buffers and work_mem.
Inside Kubernetes or containers, the symptom is identical but the mechanism differs: cgroup v2 memory.max triggers an immediate SIGKILL with no ENOMEM grace period. In both cases, stop guessing at memory limits and start budgeting.
What this means
PostgreSQL splits memory into shared and private. Shared memory, dominated by shared_buffers, is allocated once at startup and mapped into every backend. Private memory is per-process: each backend consumes work_mem for every sort, hash join, and hash aggregate it runs concurrently. Autovacuum workers use maintenance_work_mem each, unless autovacuum_work_mem is set. Temp buffers and prepared-transaction overhead add per-connection weight.
When shared memory plus peak private memory exceeds what the kernel grants, the OOM killer ranks processes by oom_score and sends SIGKILL. If the postmaster dies, the cluster restarts. If a backend dies, the application sees a dropped connection.
flowchart TD
A[Total RAM + Swap] --> B[shared_buffers
allocated at startup]
A --> C[Per-backend private memory]
C --> D[work_mem x concurrent operators]
C --> E[temp_buffers per connection]
A --> F[Autovacuum workers]
F --> G[maintenance_work_mem
or autovacuum_work_mem]
B --> H{Sum exceeds
kernel limit?}
D --> H
E --> H
G --> H
H -->|Yes| I[OOM killer targets
highest oom_score]
H -->|No| J[System stable]Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
shared_buffers oversubscribed | OOM soon after startup or during daily peaks; postmaster killed | shared_buffers vs total RAM minus headroom for backends |
work_mem multiplied by concurrent operators | Sudden backend death during large sorts, hash joins, or parallel queries; “Cannot allocate memory” in logs | EXPLAIN output for Memory nodes; pg_stat_statements around incident time |
| Connection storm without pooling | Gradual RSS growth across many backends; system-wide OOM | pg_stat_activity count vs max_connections; temp_buffers usage |
| Autovacuum worker pile-up | Memory pressure during heavy churn table maintenance | autovacuum_max_workers x maintenance_work_mem or autovacuum_work_mem |
| Container cgroup hard limit | Immediate SIGKILL at exact memory limit; host dmesg silent | Container memory limit and cgroup memory.max vs PostgreSQL budget |
Quick checks
# Confirm the Linux OOM killer fired
dmesg -T | grep -i "killed process.*postgres"
# Inspect PostgreSQL memory parameters
psql -c "SHOW shared_buffers; SHOW work_mem; SHOW maintenance_work_mem;"
psql -c "SHOW max_connections; SHOW autovacuum_max_workers; SHOW temp_buffers;"
# Count current backends and sum RSS
psql -c "SELECT state, COUNT(*) FROM pg_stat_activity GROUP BY state;"
ps -o pid,rss,comm -p $(pgrep postgres) | awk 'NR>1 {sum+=$2} END {print "Total RSS KB:", sum}'
# Check kernel overcommit behavior
sysctl vm.overcommit_memory
# Check postmaster OOM score adjustment
cat /proc/$(pgrep -o postgres)/oom_score_adj
How to diagnose it
- Confirm OOM vs ENOMEM. Check
dmesg,journalctl -k, or container status forOOMKilled. PostgreSQL logs may showFATAL: out of memoryor the connection may simply drop if SIGKILL arrived first. - Build a memory ceiling. Start with
shared_buffers. Addautovacuum_max_workers x COALESCE(autovacuum_work_mem, maintenance_work_mem)for worst-case autovacuum. Add a realistic allowance for concurrent query operators (work_memeach) and per-connection overhead. The total must leave headroom for the OS page cache and other processes; if your PostgreSQL budget exceeds ~80% of RAM, you are overcommitted. - Find the query. Query
pg_stat_statementsfor entries with highmean_exec_timeorstddev_exec_timearound the incident window. RunEXPLAIN (ANALYZE, BUFFERS)on suspects and inspectMemory:lines for hash or sort nodes. Warning:EXPLAIN ANALYZEexecutes the query. Run it during low traffic or against a restored backup. - Count the multiplier.
work_memis per operator, not per query. A complex analytical query with four hash joins and two sorts uses six timeswork_mem(orwork_mem x hash_mem_multiplierfor hashes on PostgreSQL 13+), multiplied again by anyparallel_workers. This can exceed a naiveconnections x work_memestimate by an order of magnitude. - Inspect the container boundary. If the workload runs under cgroup v2,
memory.maxis a hard ceiling that kills the highest-scoring process immediately when breached. There is no ENOMEM equivalent inside the cgroup. Checkcat /sys/fs/cgroup/memory.maxor equivalent. - Check autovacuum pressure. During high churn, multiple autovacuum workers can spawn simultaneously. If
autovacuum_work_memis unset, each worker inheritsmaintenance_work_mem, which defaults to 64 MB. Four workers equal 256 MB of concurrent allocation.
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
| System available memory | OOM fires when RAM + swap are exhausted | Available < 10% of total RAM |
| PostgreSQL backend RSS | Private memory growth per process | Backends growing well beyond idle baselines |
| Active connections | Multiplier for private memory and temp space | Active count > 70% of max_connections without a pooler |
pg_stat_database.temp_bytes | Spills indicate heavy sort/hash demand or undersized work_mem | Sustained rapid growth |
pg_stat_statements stddev time | Reveals queries with unstable memory consumption | stddev_exec_time spikes on queries with hash or sort nodes |
| Autovacuum active workers | Each worker holds maintenance memory | Multiple workers active during memory pressure windows |
Fixes
Bare metal and VMs: use strict overcommit and protect the postmaster
Set vm.overcommit_memory = 2 and size vm.overcommit_ratio (or vm.overcommit_kbytes) so the commit limit covers all system allocations. This forces the kernel to return ENOMEM instead of sending SIGKILL. The tradeoff is that you must account for every committed allocation; if the limit is too low, legitimate requests fail system-wide. PostgreSQL backends can then fail the specific query rather than taking down the postmaster.
Protect the postmaster by setting OOMScoreAdjust=-1000 in the systemd unit. Alternatively, export PG_OOM_ADJUST_FILE=/proc/self/oom_score_adj and PG_OOM_ADJUST_VALUE=0 before starting postgres so child backends revert to a neutral score while the postmaster stays immune. The tradeoff is that the OOM killer selects another process, which may still disrupt a running query.
Right-size shared_buffers
If shared_buffers exceeds ~40% of RAM, reduce it. For OLTP, start at 25% of system memory. For OLAP workloads, keep it at or below 25% because the OS page cache handles sequential scans more efficiently. The tradeoff is slightly higher latency for hot pages that no longer fit in PostgreSQL’s dedicated cache, but this is preferable to an OOM-induced restart.
Cap per-backend memory without crippling queries
Lower work_mem globally to a conservative 4-16 MB. For specific heavy queries, raise it with SET LOCAL work_mem = '64MB' inside the transaction instead of inflating the global default. On PostgreSQL 13 and newer, prefer raising hash_mem_multiplier over raising work_mem if hash operations are spilling. This isolates extra allocation to hash workloads without inflating sort budgets.
Set autovacuum_work_mem independently from maintenance_work_mem so vacuum workers do not each inherit a large global value. The tradeoff is that aggressive vacuum on very large tables may take longer with a lower memory cap, but this prevents background maintenance from triggering system-wide OOM.
Stop connection bloat
Deploy PgBouncer in transaction mode and lower max_connections to roughly CPU count x 2-3, plus replication and superuser reservations. Do not raise max_connections to absorb connection storms; each backend increases private memory pressure. The tradeoff is that session-scoped features such as advisory locks, temporary tables, and prepared statements across transactions require session-mode pooling or direct connections.
Container and Kubernetes specifics
Accept that vm.overcommit_memory = 2 is evaluated at the host level and does not prevent cgroup-level OOM kills. If your container orchestrator supports node-level swap, enable it. Swap extends the memory envelope and defers OOM kills until both RAM and swap are exhausted.
Increase the container memory limit to at least your calculated PostgreSQL budget plus 15% headroom for the operating system and monitoring agents. The tradeoff is higher node utilization, but this is cheaper than an unplanned restart.
Prevention
- Calculate the memory budget before deploying configuration changes. Keep the sum of
shared_buffers, concurrent operator memory, and autovacuum worker memory below ~80% of available RAM. - Monitor
pg_stat_database.temp_bytesas an early proxy for queries approaching memory limits. - Test new queries with
EXPLAIN (ANALYZE, BUFFERS)in staging. InspectMemory:usage in hash and sort nodes. - Review per-table autovacuum settings so high-churn tables do not spawn excessive concurrent workers.
How Netdata helps
- Correlates system memory pressure (
mem.available,Used RAM) with PostgreSQL connection counts and active backends to highlight when private memory multipliers are the culprit. - Surfaces kernel OOM kill events alongside PostgreSQL process metrics so you can distinguish host-level OOM from cgroup
memory.maxkills in containers. - Tracks
pg_stat_database.temp_bytesto flag queries that are spilling to disk before they trigger an OOM. - Monitors autovacuum worker count and connection utilization to catch memory multiplier spikes early.
Related guides
- How PostgreSQL actually works in production: a mental model for operators
- PostgreSQL ALTER TABLE blocked: zero-downtime DDL patterns
- PostgreSQL autovacuum blocked by long-running transaction: detection and fix
- PostgreSQL autovacuum not running: detection, causes, and fixes
- PostgreSQL autovacuum tuning: per-table thresholds for high-churn workloads
- PostgreSQL blocking queries: finding the root blocker in a lock cascade
- PostgreSQL checkpoint storms: detection, causes, and tuning
- PostgreSQL: checkpoints are occurring too frequently – what to tune
- PostgreSQL connection exhaustion: detection, diagnosis, and prevention
- PostgreSQL connection refused: pg_hba, listen_addresses, and TCP diagnosis
- PostgreSQL: database is not accepting commands to avoid wraparound data loss
- PostgreSQL dead tuples piling up: why autovacuum can’t keep up






