nginx: worker_connections are not enough — causes and fixes

Your error log shows worker_connections are not enough while connecting to upstream. New clients time out while existing connections may still work. This is a hard capacity cliff: once a worker exhausts its connection slots, it cannot accept new connections until a slot frees. The default limit is 512 per worker, not 1024, and in reverse-proxy mode each request consumes at least two slots. Raising the number is often the first reaction, but if a slow backend is holding connections open, the slots will fill again no matter how high you set the limit.

What this means

worker_connections is a per-worker hard ceiling inside the events block. Every client connection, upstream connection, and idle keepalive connection counts as one slot. When a worker hits the limit, the kernel may still complete TCP handshakes and queue them in the listen backlog, but nginx cannot accept them. The stub_status counters reveal this through a growing gap between accepts and handled.

For reverse-proxy workloads, effective capacity is at most half of worker_connections * worker_processes because each proxied request ties up one slot on the client side and one on the upstream side. File descriptors impose a second, independent ceiling. Each connection needs an FD for its socket, and a connection that buffers to a temp file or serves a file from disk needs a second one, so worker_rlimit_nofile should be at least worker_connections * 2 plus headroom for logs. nginx checks at startup and emits a warning if worker_connections exceeds the open-file limit, but it will still start.
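The capacity arithmetic above can be sketched as a small shell helper. The function name `proxy_capacity` and the sample values are illustrative assumptions, and the result is a worst-case estimate that only restates the two-slots-per-proxied-request rule.

```shell
#!/bin/sh
# Hypothetical helper: worst-case concurrent proxied requests for a config.
# $1 = worker_connections, $2 = worker_processes
proxy_capacity() {
  total_slots=$(( $1 * $2 ))
  # Each proxied request holds one client slot and one upstream slot.
  echo $(( total_slots / 2 ))
}

# Example: the default 512 connections across 4 workers
proxy_capacity 512 4   # prints 1024
```

Idle keepalive connections also occupy slots, so real capacity under load is usually lower than this figure.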

flowchart TD
    A[Client request arrives] --> B{Worker has free slot?}
    B -->|Yes| C[Accept client connection]
    C --> D[Open upstream connection]
    D --> E[Process response]
    B -->|No| F[Connection dropped]
    F --> G[accepts > handled gap grows]
    E --> H{Backend slow?}
    H -->|Yes| I[Slot held open]
    I --> B
    H -->|No| J[Close or keepalive]
    J --> B

Common causes

| Cause | What it looks like | First thing to check |
| --- | --- | --- |
| Slow backend holding connections | `Writing` state dominates; `$upstream_response_time` climbs; error log shows `upstream timed out` | `stub_status` state breakdown and access log upstream latency |
| `worker_connections` too low for proxy load | Active connections flat near `worker_connections * worker_processes`; `accepts` exceeds `handled` | `stub_status` and `nginx -T` for the configured limit |
| File descriptor limit below connection limit | `accept4() failed (24: Too many open files)` in error log; FD count near limit per worker | `/proc/<worker_pid>/limits` or `prlimit` |
| Excessive keepalive idle connections | `Waiting` state dominates; low request rate but high active connections | `keepalive_timeout` and `keepalive_requests` values |
| Traffic spike or connection flood | Sudden spike in active connections; `Reading` state may rise | Request rate from access log and `ss` SYN queue state |

Quick checks

# Confirm the error and frequency
grep -c "worker_connections are not enough" /var/log/nginx/error.log

# Check connection counts and accepts vs handled gap
# (assumes stub_status is exposed at /nginx_status on localhost)
curl -s http://127.0.0.1/nginx_status

# Inspect configured limits
nginx -T 2>/dev/null | grep -E 'worker_connections|worker_rlimit_nofile|worker_processes'

# File descriptor usage per worker against its limit
for pid in $(pgrep -P $(cat /var/run/nginx.pid)); do
  used=$(ls /proc/$pid/fd 2>/dev/null | wc -l)
  max=$(awk '/^Max open files/ {print $4}' /proc/$pid/limits)
  echo "Worker $pid: $used / $max FDs"
done

# Kernel-level listen drops (silent to nginx)
nstat -az TcpExtListenOverflows 2>/dev/null | awk '/ListenOverflows/ {print $2}'

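# Upstream latency in recent requests (assumes $upstream_response_time is the LAST field in log_format)
# Non-numeric values such as "-" (requests that never reached an upstream) are skipped
tail -n 5000 /var/log/nginx/access.log | awk '$NF ~ /^[0-9.]+$/ {print $NF}' | sort -n | \
  awk 'BEGIN{c=0} {a[c++]=$1} END{if (c) print "p95:", a[int(c*0.95)], "p99:", a[int(c*0.99)]}'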

How to diagnose it

Verify the symptom. Search the error log for worker_connections are not enough. Note whether it correlates with traffic spikes or appears continuously under normal load.
Calculate slot utilization. Pull Active connections from stub_status and divide by worker_connections * worker_processes. If the result is above 80%, you are in the danger zone. For a reverse proxy, halve the denominator first (equivalently, double the result), because each proxied request also holds an upstream slot.
Check for admission loss. Compare the accepts and handled counters in stub_status. If accepts exceeds handled and the gap is growing, nginx is actively dropping connections.
Check the FD ceiling. Inspect /proc/<worker_pid>/limits for Max open files. If FD usage is within 20% of the limit, the real bottleneck is worker_rlimit_nofile or the OS ulimit, not worker_connections.
Identify the connection state mix. High Writing with elevated $upstream_response_time points to slow backends. High Waiting with flat traffic points to keepalive hoarding. High Reading without corresponding throughput points to slow clients or a flood of new connections.
Look for kernel drops. Check TcpExtListenOverflows. If it is increasing, the kernel is dropping SYNs or completed connections before nginx can accept them. This produces client-side timeouts with no entry in nginx logs.
Correlate with backend health. If upstream latency rose before the error appeared, the backend is the root cause. Fixing the limit without fixing the backend will only postpone the next outage.
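The utilization and state-mix steps above can be sketched as one helper that reads stub_status output on stdin. The function name `stub_summary` and its arguments are assumptions for illustration. Passing 2 as the third argument applies the worst-case reverse-proxy factor.

```shell
#!/bin/sh
# Hypothetical helper: summarize stub_status text read from stdin.
# $1 = worker_connections, $2 = worker_processes,
# $3 = slots per request (1 by default, 2 for a reverse proxy)
stub_summary() {
  awk -v wc="$1" -v wp="$2" -v per="${3:-1}" '
    /^Active connections:/ { active = $3 }
    /^Reading:/ { reading = $2; writing = $4; waiting = $6 }
    END {
      # Slots in use as a percentage of all configured slots (truncated)
      printf "utilization_pct=%d\n", (active * per * 100) / (wc * wp)
      printf "reading=%d writing=%d waiting=%d\n", reading, writing, waiting
    }'
}

# Usage sketch (assumes stub_status is served at this URL):
#   curl -s http://127.0.0.1/nginx_status | stub_summary 4096 4 2
```

A high utilization_pct with Writing dominating points toward the backend, while Waiting dominating points toward keepalive settings.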

Metrics and signals to monitor

| Signal | Why it matters | Warning sign |
| --- | --- | --- |
| Connection slot utilization | Reveals how close you are to the hard per-worker cap | >80% of `worker_connections * worker_processes` sustained |
| Dropped connections (`accepts - handled` gap) | The earliest indicator that nginx is refusing work | Gap increasing over a 60-second window |
| Active connection state breakdown | Distinguishes backend slowness from keepalive bloat | `Writing` >60% of active with low throughput |
| File descriptor usage per worker | FD exhaustion mimics connection exhaustion | >80% of `worker_rlimit_nofile` or OS limit |
| Upstream response time (P95) | Slow backends are the most common hidden cause | P95 trending up or >80% of `proxy_read_timeout` |
| Kernel listen overflows | Connections dropped before nginx sees them | `TcpExtListenOverflows` counter increasing |
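The "gap increasing over a 60-second window" signal can be checked by hand with two stub_status samples. This is a minimal sketch: the helper names `dropped_total` and `gap_growth` are hypothetical, and the URL in the usage comment assumes stub_status is exposed at /nginx_status.

```shell
#!/bin/sh
# Hypothetical helper: print accepts - handled from stub_status text on stdin.
# The counters line is the one made up of three integers.
dropped_total() {
  awk '/^ *[0-9]+ +[0-9]+ +[0-9]+ *$/ { print $1 - $2 }'
}

# Hypothetical helper: growth of the gap between two samples.
# $1 = earlier dropped total, $2 = later dropped total
gap_growth() {
  echo $(( $2 - $1 ))
}

# Usage sketch:
#   before=$(curl -s http://127.0.0.1/nginx_status | dropped_total)
#   sleep 60
#   after=$(curl -s http://127.0.0.1/nginx_status | dropped_total)
#   [ "$(gap_growth "$before" "$after")" -gt 0 ] && echo "connections dropped in the last 60s"
```

Both counters reset when nginx restarts, so a negative growth value means the samples straddle a restart and should be discarded.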

Fixes

Raise the limit correctly

If the limit is genuinely too low for your traffic, increase worker_connections. Remember the default is 512, not 1024. In nginx.conf, worker_connections goes inside the events block, while worker_rlimit_nofile belongs in the main (top-level) context:

worker_rlimit_nofile 8192;

events {
    worker_connections 4096;
}

For a reverse proxy, worker_rlimit_nofile should be at least double worker_connections: sockets alone can use up to worker_connections descriptors, and the remainder covers temp files for buffered responses, served files, and logs. Check the syntax with nginx -t, then reload with nginx -s reload; a restart is not required. If nginx runs under systemd, verify that LimitNOFILE in the service unit is not overriding your config.
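One way to confirm the two-to-one relationship after editing is to scan the effective configuration. The sketch below is an assumption-laden helper, not an nginx feature: `check_fd_headroom` is a hypothetical name, it expects each directive on its own line as `nginx -T` prints them, and it falls back to the default of 512 when worker_connections is unset.

```shell
#!/bin/sh
# Hypothetical helper: read an nginx config dump on stdin and check that
# worker_rlimit_nofile is at least twice worker_connections.
check_fd_headroom() {
  awk '
    $1 == "worker_connections"   { sub(/;$/, "", $2); conns = $2 }
    $1 == "worker_rlimit_nofile" { sub(/;$/, "", $2); nofile = $2 }
    END {
      if (conns == "") { conns = 512 }   # nginx default when unset
      if (nofile == "") {
        print "worker_rlimit_nofile not set; workers inherit the OS limit"
        exit 0
      }
      if (nofile + 0 < 2 * conns) { print "LOW: " nofile " < 2 * " conns; exit 1 }
      print "OK: " nofile " >= 2 * " conns
    }'
}

# Usage sketch:
#   nginx -T 2>/dev/null | check_fd_headroom
```

The function exits non-zero only when the limit is set and too low, so it can gate a reload in a deploy script.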

Fix the backend first

If Writing connections are high and upstream latency is elevated, raising worker_connections will only delay the inevitable. The correct fix is to restore backend performance. As an emergency mitigation, you can reduce proxy_read_timeout so nginx gives up on slow upstreams faster. This trades 504 Gateway Time-outs for freed connection slots, which is usually preferable to total connection exhaustion.

Reclaim keepalive capacity

If Waiting connections dominate, lower keepalive_timeout or keepalive_requests to recycle idle slots faster. In upstream blocks, ensure the keepalive pool size matches your concurrency; an oversized pool wastes slots, while an undersized pool causes excessive upstream connection churn.

Address file descriptor limits

When the error log shows accept4() failed (24: Too many open files), the FD limit is the real ceiling, not worker_connections. Raise worker_rlimit_nofile and verify the OS soft limit with prlimit -n -p <worker_pid>. If systemd manages the process, create a drop-in override for LimitNOFILE, run systemctl daemon-reload, and restart nginx.
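The systemd steps above might look like the following sketch. The helper name `write_nofile_dropin`, the file name limits.conf, the unit name nginx.service, and the value 65536 are all assumptions for illustration; adjust them to your system.

```shell
#!/bin/sh
# Hypothetical helper: write a systemd drop-in that raises LimitNOFILE.
# $1 = drop-in directory, $2 = limit (both chosen by the caller)
write_nofile_dropin() {
  mkdir -p "$1" || return 1
  printf '[Service]\nLimitNOFILE=%s\n' "$2" > "$1/limits.conf"
}

# Usage sketch (run as root; assumes the unit is nginx.service):
#   write_nofile_dropin /etc/systemd/system/nginx.service.d 65536
#   systemctl daemon-reload
#   systemctl restart nginx
#   pid=$(pgrep -P "$(cat /var/run/nginx.pid)" | head -n 1)
#   prlimit --pid "$pid" --nofile
```

Checking a worker PID rather than the master matters because worker_rlimit_nofile is applied to workers.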

Prevention

Size worker_connections so that worker_connections * worker_processes is at least 2x your peak proxied request concurrency, then add headroom for keepalive idle connections and WebSockets.
Set worker_rlimit_nofile to at least twice worker_connections, and verify it is not overridden by the init system or container runtime.
Monitor the accepts - handled gap from stub_status. A nonzero rate is a leading indicator that appears before users complain.
Monitor upstream response time percentiles. Rising backend latency is the most reliable predictor of future connection exhaustion.
Set worker_shutdown_timeout so old workers do not linger indefinitely after reloads, hoarding slots and FDs on long-lived connections.

How Netdata helps

Correlates nginx active connections, connection state breakdown, and the accepts-handled gap in one view so you can spot admission loss immediately.
Tracks per-process file descriptor usage and warns when workers approach the worker_rlimit_nofile ceiling.
Surfaces upstream response time metrics alongside nginx connection metrics, making it obvious when a slow backend is the root cause.
Monitors kernel-level TcpExtListenOverflows to catch silent drops that never appear in nginx logs.
Alerts on connection slot utilization ratio so you can act before the hard limit is reached.
