PostgreSQL FATAL: too many connections - causes and fixes
Applications throw FATAL: sorry, too many clients already. Health checks fail. Retries amplify the problem. The database is not down, but it rejects new traffic.
PostgreSQL uses a process-per-connection model. Every backend holds memory and scheduler time even when idle. Raising max_connections usually deepens the problem because the root cause is why slots are occupied and what those backends are doing.
This guide shows how to diagnose connection exhaustion, distinguish PostgreSQL saturation from pooler exhaustion, and fix common root causes safely.
What this means
max_connections is a hard limit that requires a server restart to change. Once the count of active, idle, and idle-in-transaction backends reaches it, PostgreSQL refuses new TCP connections with a FATAL error. Existing connections keep running, but nothing new gets in.
Each backend holds process memory; an idle backend often holds a few megabytes of RSS, plus work_mem and temporary allocations when active. Context-switch overhead rises with process count, so performance degrades before the limit is hit.
PostgreSQL reserves a small emergency tier. superuser_reserved_connections (default 3) is subtracted from max_connections. When only those slots remain, only superusers can connect. In PostgreSQL 16 and later, reserved_connections adds a secondary tier for roles granted pg_use_reserved_connections. If regular users are locked out but psql -U postgres still works, you are in the reserved window, not a total crash.
Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| Idle-in-transaction sessions | pg_stat_activity shows idle in transaction with old state_change timestamps. These hold row locks and block vacuum, so bloat accelerates. | SELECT count(*), max(now() - state_change) FROM pg_stat_activity WHERE state = 'idle in transaction'; |
| Missing or misconfigured connection pooler | Hundreds of idle backends. Throughput flattens as context-switch overhead rises. App opens a connection per request. | SELECT state, count(*) FROM pg_stat_activity GROUP BY state; and confirm whether PgBouncer or equivalent is deployed. |
| Deploy-time connection storm | Errors correlate with rolling deploys or pod restarts. New instances connect before old instances release. | SELECT application_name, count(*) FROM pg_stat_activity GROUP BY 1 ORDER BY 2 DESC; |
| PgBouncer pool exhaustion | Apps report too many clients, but PostgreSQL backend count is well below max_connections. | PgBouncer SHOW POOLS; output for queued or refused clients. |
| Superuser lockout | Monitoring and app users cannot connect, but a superuser connection succeeds. | Compare total backend count to max_connections - superuser_reserved_connections. |
Quick checks
Run these read-only commands before making changes.
# Connection state distribution
psql -c "SELECT state, count(*) FROM pg_stat_activity GROUP BY state ORDER BY count DESC;"
# Connections by application source
psql -c "SELECT application_name, count(*), state FROM pg_stat_activity GROUP BY 1, 3 ORDER BY 2 DESC;"
# Connection and reservation limits
psql -c "SHOW max_connections; SHOW superuser_reserved_connections;"
# Oldest idle-in-transaction sessions
psql -c "SELECT pid, usename, application_name, now() - state_change AS idle_time FROM pg_stat_activity WHERE state = 'idle in transaction' ORDER BY state_change LIMIT 10;"
# PgBouncer pool utilization if deployed
psql -p 6432 -U pgbouncer pgbouncer -c "SHOW POOLS;"
# Idle backends outside transactions
psql -c "SELECT count(*) FROM pg_stat_activity WHERE state = 'idle' AND backend_type = 'client backend';"
How to diagnose it
flowchart TD
A[FATAL: too many clients] --> B{pg_stat_activity
near max_connections?}
B -->|Yes| C[Classify states:
active idle idle_in_transaction]
B -->|No| D[Check PgBouncer
SHOW POOLS]
C --> E{idle in transaction
dominant?}
E -->|Yes| F[Kill old sessions
set timeout]
E -->|No| G{Idle dominant?}
G -->|Yes| H[Add or tune
connection pooler]
G -->|No| I[Check per-app
connection storms]
D --> J[Resize pool or
add pooler nodes]- Determine whether PostgreSQL or the pooler is the bottleneck. If
pg_stat_activitybackend counts are far belowmax_connections, check PgBouncerSHOW POOLS;forcl_waitingor exhaustedmax_client_conn. - Classify connection states.
activemeans executing a query.idleholds a process but no transaction.idle in transactionholds a snapshot and often row locks. When this last category dominates, it is usually the operational cause of exhaustion. - Map backends to sources. Use
application_nameto identify which service, batch job, or deploy is responsible for the spike. - Check for old transactions.
idle in transactionsessions older than a few minutes indicate an application leak, an uncommitted psql session, or an ORM that auto-begins transactions and never closes them. - Verify reserved slots. If only superusers can connect, the remaining free slots have dropped into the
superuser_reserved_connectionswindow. You need superuser access to clean up backends or restart the server. - Correlate with events. Connection storms often coincide with deploys, traffic spikes, or pool misconfigurations that create a thundering herd.
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
| Connection utilization | Percentage of max_connections in use, regardless of state. | Sustained above 70%, or any spike above 90%. |
| Idle-in-transaction count and age | These hold snapshots and may hold locks, blocking vacuum and DDL. | Any session older than 60 seconds. |
| Idle backend count | Indicates connection leaks or missing pooling. | Growing while application traffic is flat. |
| PgBouncer pool exhaustion | Front-end limit reached before the PostgreSQL limit. | Queued or refused clients in SHOW POOLS. |
| Backend memory | Each backend allocates process memory; OOM risk rises with count. | Resident set size growing proportionally to connections. |
Fixes
Terminate idle-in-transaction sessions
This is the most common immediate fix.
# WARNING: rolls back targeted sessions. Confirm they are not critical.
psql -c "SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE state = 'idle in transaction' AND state_change < now() - interval '5 minutes';"
To prevent recurrence, set idle_in_transaction_session_timeout to 60-300 seconds. The default is 0 (disabled). For most OLTP workloads, 120 seconds is a safe starting point. This parameter does not require a restart and applies to new sessions immediately. When a session exceeds the timeout, PostgreSQL sends a FATAL error and rolls back the transaction.
Deploy a connection pooler
If your application opens direct connections and you routinely exceed a few hundred backends, deploy PgBouncer in transaction mode. Size PostgreSQL max_connections modestly (for example, 100-200) and let PgBouncer multiplex thousands of client connections. A reasonable starting point for default_pool_size is 2-3 times the number of CPU cores, but benchmark your workload.
Transaction mode breaks session-level features: prepared statements held across transactions, advisory locks, SET LOCAL, temporary tables, and LISTEN/NOTIFY. If your workload depends on these, use session mode or application-side pooling, but memory savings are smaller.
Fix deploy-time connection storms
Rolling deployments often create new pods before old ones drain. Mitigations:
- Drain connections in application shutdown hooks before the pod terminates.
- Set conservative maximums in application connection pools.
- Stagger rollouts or reduce max surge so total concurrent pods do not spike connection count.
Resize PgBouncer pools
If PostgreSQL has free backends but PgBouncer rejects clients, raise default_pool_size, reserve_pool_size, or add PgBouncer instances behind a load balancer. Ensure server_idle_timeout is not holding stale backends indefinitely.
Adjust reserved connections
If you need guaranteed superuser access during saturation, increase superuser_reserved_connections and restart PostgreSQL. In PostgreSQL 16 and later, consider granting pg_use_reserved_connections to monitoring roles and setting reserved_connections so observability agents can connect during saturation.
Prevention
- Deploy PgBouncer before traffic scales. Pooling is essential once you exceed roughly 50-100 concurrent application connections.
- Set timeouts from day one. Configure
idle_in_transaction_session_timeout = '2min',idle_session_timeout = '30min'on PostgreSQL 14 and later, andstatement_timeoutbased on your worst-case acceptable query duration. - Cap application pools. Application-side pool maximums should leave headroom below
max_connections. If you have 10 application servers with a pool max of 20, you already risk 200 concurrent connections. Never let total app max pools sum to the server limit. - Monitor per application. Tag connections with meaningful
application_namevalues so you can identify which service is leaking or storming. - Test deploy patterns in staging. Validate that rolling restarts do not spike backend counts.
How Netdata helps
- Correlates PostgreSQL connection utilization with query latency and active backend counts on the same timeline, so you can see whether saturation causes slowdowns.
- Surfaces
idle in transactioncounts separately fromidlecounts, making leaks visible before exhaustion. - Alerts on connection utilization spikes and on sessions that sit idle in transaction past a threshold.
- When PgBouncer is monitored, shows pool exhaustion alongside PostgreSQL backend counts, helping you distinguish pooler saturation from PostgreSQL saturation.






