$ guides / postgres / postgres-too-many-connections ▌

Operations Guides

PostgreSQL FATAL: Too Many Connections - Causes & Fixes

Applications throw FATAL: sorry, too many clients already. Health checks fail. Retries amplify the problem. The database is not down, but it rejects new traffic.

PostgreSQL uses a process-per-connection model. Every backend holds memory and scheduler time even when idle. Raising max_connections usually deepens the problem because the root cause is why slots are occupied and what those backends are doing.

This guide shows how to diagnose connection exhaustion, distinguish PostgreSQL saturation from pooler exhaustion, and fix common root causes safely.

What This Means

max_connections is a hard limit that requires a server restart to change. Once the count of active, idle, and idle-in-transaction backends reaches it, PostgreSQL refuses new TCP connections with a FATAL error. Existing connections keep running, but nothing new gets in.

Each backend holds process memory; an idle backend often holds a few megabytes of RSS, plus work_mem and temporary allocations when active. Context-switch overhead rises with process count, so performance degrades before the limit is hit.

PostgreSQL reserves a small emergency tier. superuser_reserved_connections (default 3) is subtracted from max_connections. When only those slots remain, only superusers can connect. In PostgreSQL 16 and later, reserved_connections adds a secondary tier for roles granted pg_use_reserved_connections. If regular users are locked out but psql -U postgres still works, you are in the reserved window, not a total crash.

Common Causes

Cause	What it looks like	First thing to check
Idle-in-transaction sessions	`pg_stat_activity` shows `idle in transaction` with old `state_change` timestamps. These hold row locks and block vacuum, so bloat accelerates.	`SELECT count(*), max(now() - state_change) FROM pg_stat_activity WHERE state = 'idle in transaction';`
Missing or misconfigured connection pooler	Hundreds of `idle` backends. Throughput flattens as context-switch overhead rises. App opens a connection per request.	`SELECT state, count(*) FROM pg_stat_activity GROUP BY state;` and confirm whether PgBouncer or equivalent is deployed.
Deploy-time connection storm	Errors correlate with rolling deploys or pod restarts. New instances connect before old instances release.	`SELECT application_name, count(*) FROM pg_stat_activity GROUP BY 1 ORDER BY 2 DESC;`
PgBouncer pool exhaustion	Apps report too many clients, but PostgreSQL backend count is well below `max_connections`.	PgBouncer `SHOW POOLS;` output for queued or refused clients.
Superuser lockout	Monitoring and app users cannot connect, but a superuser connection succeeds.	Compare total backend count to `max_connections - superuser_reserved_connections`.

Quick Checks

Run these read-only commands before making changes.

# Connection state distribution
psql -c "SELECT state, count(*) FROM pg_stat_activity GROUP BY state ORDER BY count DESC;"

# Connections by application source
psql -c "SELECT application_name, count(*), state FROM pg_stat_activity GROUP BY 1, 3 ORDER BY 2 DESC;"

# Connection and reservation limits
psql -c "SHOW max_connections; SHOW superuser_reserved_connections;"

# Oldest idle-in-transaction sessions
psql -c "SELECT pid, usename, application_name, now() - state_change AS idle_time FROM pg_stat_activity WHERE state = 'idle in transaction' ORDER BY state_change LIMIT 10;"

# PgBouncer pool utilization if deployed
psql -p 6432 -U pgbouncer pgbouncer -c "SHOW POOLS;"

# Idle backends outside transactions
psql -c "SELECT count(*) FROM pg_stat_activity WHERE state = 'idle' AND backend_type = 'client backend';"

How To Diagnose It

flowchart TD
    A[FATAL: too many clients] --> B{pg_stat_activity
near max_connections?}
    B -->|Yes| C[Classify states:
active idle idle_in_transaction]
    B -->|No| D[Check PgBouncer
SHOW POOLS]
    C --> E{idle in transaction
dominant?}
    E -->|Yes| F[Kill old sessions
set timeout]
    E -->|No| G{Idle dominant?}
    G -->|Yes| H[Add or tune
connection pooler]
    G -->|No| I[Check per-app
connection storms]
    D --> J[Resize pool or
add pooler nodes]

Determine whether PostgreSQL or the pooler is the bottleneck. If pg_stat_activity backend counts are far below max_connections, check PgBouncer SHOW POOLS; for cl_waiting or exhausted max_client_conn.
Classify connection states. active means executing a query. idle holds a process but no transaction. idle in transaction holds a snapshot and often row locks. When this last category dominates, it is usually the operational cause of exhaustion.
Map backends to sources. Use application_name to identify which service, batch job, or deploy is responsible for the spike.
Check for old transactions. idle in transaction sessions older than a few minutes indicate an application leak, an uncommitted psql session, or an ORM that auto-begins transactions and never closes them.
Verify reserved slots. If only superusers can connect, the remaining free slots have dropped into the superuser_reserved_connections window. You need superuser access to clean up backends or restart the server.
Correlate with events. Connection storms often coincide with deploys, traffic spikes, or pool misconfigurations that create a thundering herd.

Metrics & Signals To Monitor

Signal	Why it matters	Warning sign
Connection utilization	Percentage of `max_connections` in use, regardless of state.	Sustained above 70%, or any spike above 90%.
Idle-in-transaction count and age	These hold snapshots and may hold locks, blocking vacuum and DDL.	Any session older than 60 seconds.
Idle backend count	Indicates connection leaks or missing pooling.	Growing while application traffic is flat.
PgBouncer pool exhaustion	Front-end limit reached before the PostgreSQL limit.	Queued or refused clients in `SHOW POOLS`.
Backend memory	Each backend allocates process memory; OOM risk rises with count.	Resident set size growing proportionally to connections.

Fixes

Terminate Idle-In-Transaction Sessions

This is the most common immediate fix.

# WARNING: rolls back targeted sessions. Confirm they are not critical.
psql -c "SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE state = 'idle in transaction' AND state_change < now() - interval '5 minutes';"

To prevent recurrence, set idle_in_transaction_session_timeout to 60-300 seconds. The default is 0 (disabled). For most OLTP workloads, 120 seconds is a safe starting point. This parameter does not require a restart and applies to new sessions immediately. When a session exceeds the timeout, PostgreSQL sends a FATAL error and rolls back the transaction.

Deploy A Connection Pooler

If your application opens direct connections and you routinely exceed a few hundred backends, deploy PgBouncer in transaction mode. Size PostgreSQL max_connections modestly (for example, 100-200) and let PgBouncer multiplex thousands of client connections. A reasonable starting point for default_pool_size is 2-3 times the number of CPU cores, but benchmark your workload.

Transaction mode breaks session-level features: prepared statements held across transactions, advisory locks, SET LOCAL, temporary tables, and LISTEN/NOTIFY. If your workload depends on these, use session mode or application-side pooling, but memory savings are smaller.

Fix Deploy-Time Connection Storms

Rolling deployments often create new pods before old ones drain. Mitigations:

Drain connections in application shutdown hooks before the pod terminates.
Set conservative maximums in application connection pools.
Stagger rollouts or reduce max surge so total concurrent pods do not spike connection count.

Resize PgBouncer Pools

If PostgreSQL has free backends but PgBouncer rejects clients, raise default_pool_size, reserve_pool_size, or add PgBouncer instances behind a load balancer. Ensure server_idle_timeout is not holding stale backends indefinitely.

Adjust Reserved Connections

If you need guaranteed superuser access during saturation, increase superuser_reserved_connections and restart PostgreSQL. In PostgreSQL 16 and later, consider granting pg_use_reserved_connections to monitoring roles and setting reserved_connections so observability agents can connect during saturation.

Prevention

Deploy PgBouncer before traffic scales. Pooling is essential once you exceed roughly 50-100 concurrent application connections.
Set timeouts from day one. Configure idle_in_transaction_session_timeout = '2min', idle_session_timeout = '30min' on PostgreSQL 14 and later, and statement_timeout based on your worst-case acceptable query duration.
Cap application pools. Application-side pool maximums should leave headroom below max_connections. If you have 10 application servers with a pool max of 20, you already risk 200 concurrent connections. Never let total app max pools sum to the server limit.
Monitor per application. Tag connections with meaningful application_name values so you can identify which service is leaking or storming.
Test deploy patterns in staging. Validate that rolling restarts do not spike backend counts.

How Netdata Helps

Correlates PostgreSQL connection utilization with query latency and active backend counts on the same timeline, so you can see whether saturation causes slowdowns.
Surfaces idle in transaction counts separately from idle counts, making leaks visible before exhaustion.
Alerts on connection utilization spikes and on sessions that sit idle in transaction past a threshold.
When PgBouncer is monitored, shows pool exhaustion alongside PostgreSQL backend counts, helping you distinguish pooler saturation from PostgreSQL saturation.

PostgreSQL monitoring checklist: the signals every production database needs

Find more guides here.

The Netdata solution

PostgreSQL monitoring with Netdata

Netdata monitors PostgreSQL with per-second metrics, pre-built dashboards, and ML-powered anomaly detection. Correlate connection saturation, lock waits, autovacuum progress, replication lag, and checkpoint I/O against the rest of your stack so you catch the incidents in these runbooks before they page anyone.

See PostgreSQL monitoring → Start monitoring free