NGINX active connections climbing: reading, writing, waiting explained

When operators see Active connections climbing in stub_status, the first instinct is often to add capacity. That instinct is usually wrong. The stub_status module exposes exactly seven metrics, and the most useful of them is the breakdown of active connections into Reading, Writing, and Waiting. The absolute number of active connections is almost meaningless without the ratio between these three states. A server with 10,000 active connections where 9,000 are Waiting is healthy. A server with 500 active connections where 400 are Reading may be under a slowloris-style attack.

This article explains how nginx assigns connections to each state, what the ratios reveal about your traffic, and why high numbers in one bucket are healthy while high numbers in another are pathological. If you understand these three numbers, you can stop chasing phantom upstream latency and start diagnosing the real bottleneck.

What active connections actually measure

Active connections is the sum of all client sockets currently allocated by nginx worker processes. This includes connections that are mid-request, connections proxying to upstream, and idle keepalive connections waiting for the next HTTP request. It does not include half-open TCP connections sitting in the kernel backlog, nor does it count upstream-only sockets that have not yet been paired with a client request.

Each active connection consumes one slot against worker_connections (default 512 per worker) and one file descriptor. In reverse proxy mode, every request occupies at least two slots: one for the client-facing socket and one for the upstream socket. This means effective proxy capacity is at most half the configured limit, minus whatever idle keepalive sockets occupy on either side. Teams that set worker_connections to 1024 and expect 1,024 simultaneous proxied requests are off by a factor of two before accounting for keepalive.
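The arithmetic above can be sketched as a small shell helper. This is a rough illustration, not a sizing tool: the function name `proxy_capacity` is hypothetical, and it assumes exactly two slots per proxied request while ignoring idle keepalive sockets, so the real ceiling will be lower.

```shell
# Rough ceiling on simultaneous proxied requests.
# $1 = worker_processes, $2 = worker_connections (values from your config).
# Assumes exactly two slots per proxied request (client + upstream) and
# ignores slots held by idle keepalive sockets, so real capacity is lower.
proxy_capacity() {
    total=$(( $1 * $2 ))
    echo "total_slots=$total max_proxied_requests=$(( total / 2 ))"
}

proxy_capacity 1 1024   # one worker with worker_connections 1024
proxy_capacity 4 512    # four workers at the built-in default of 512
```

With one worker and worker_connections set to 1024, the helper reports 512 proxied requests at most, which is the factor-of-two gap described above.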

HTTP/2 multiplexes many requests over a single TCP connection, so a lone active connection in Writing may actually carry dozens of concurrent streams. WebSocket connections count as a single active connection for their entire lifetime regardless of message rate. The active connection count is therefore a measure of socket occupancy, not request concurrency.

Because stub_status reports the state of each connection at the moment you query it, the numbers are point-in-time snapshots. A connection may move from Reading to Writing to Waiting within a few milliseconds under normal load. Sample at a fixed interval of five to ten seconds and look at trends, not individual readings.

How the three states work

flowchart LR
  A[New connection] --> B[Reading]
  B -->|request complete| C[Processing]
  C -->|upstream wait or response send| D[Writing]
  D -->|keepalive enabled| E[Waiting]
  E -->|next request| B
  D -->|connection close| F[Close]
  E -->|keepalive timeout| F

Reading. The connection is in Reading while nginx consumes bytes from the client. This phase covers the request line, headers, and request body. In normal operation, a connection should transition through Reading in milliseconds. A connection that stays in Reading for seconds or minutes is either receiving a large upload or is stalled waiting for bytes from a slow client.

Writing. The connection enters Writing once nginx finishes reading the request and begins producing a response. This is the most misleading label in stub_status, because Writing also covers connections where nginx has read the full request and is still waiting for an upstream server to respond. During proxying, the connection remains in Writing while the worker buffers the upstream response or streams it to the client. In a proxy deployment, a connection may spend most of its time in Writing waiting for the upstream's response headers (the interval logged as $upstream_header_time), not transmitting bytes to the client.

Waiting. These are idle keepalive connections. Mathematically, Waiting equals Active connections minus Reading minus Writing. They hold open file descriptors and a small amount of memory but consume almost no CPU. A high Waiting count is evidence that clients are efficiently reusing connections, which is the intended behavior of HTTP/1.1 keepalive and HTTP/2. Waiting only becomes a problem when it consumes so many slots that new connections cannot be accepted.

You can sample the breakdown manually from the status endpoint:

# Check current state breakdown
curl -s http://127.0.0.1/nginx_status | awk '/Reading/ {print "R:"$2, "W:"$4, "Wait:"$6}'
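Since the ratios matter more than the raw counts, it can help to print each state as a share of Active connections. The sketch below assumes the standard four-line stub_status text output; the function name `state_ratios` and the sample numbers are illustrative only.

```shell
# Print each state as a percentage of Active connections.
# Reads stub_status output on stdin.
state_ratios() {
    awk '
        /^Active connections:/ { active = $3 }
        /^Reading:/ { reading = $2; writing = $4; waiting = $6 }
        END {
            if (active == 0) { print "no active connections"; exit }
            printf "reading=%.1f%% writing=%.1f%% waiting=%.1f%%\n",
                100 * reading / active, 100 * writing / active, 100 * waiting / active
        }
    '
}

# Illustrative sample. Against a live server:
#   curl -s http://127.0.0.1/nginx_status | state_ratios
state_ratios <<'EOF'
Active connections: 1000
server accepts handled requests
 5000 5000 9000
Reading: 50 Writing: 150 Waiting: 800
EOF
```

For the sample input this prints `reading=5.0% writing=15.0% waiting=80.0%`.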

What the ratios tell you in production

Use ratios, not absolutes. A baseline of 10,000 active connections tells you nothing. A baseline where 80% are Waiting, 15% are Writing, and 5% are Reading describes a healthy keepalive-heavy workload. The table below maps patterns to their likely operational meaning.

Pattern	Likely meaning	Correlation to check
Waiting is 70-90% of Active	Healthy keepalive reuse	Connection slot utilization; if near limit, reduce `keepalive_timeout` instead of adding capacity
Writing is > 50% of Active, normal upstream latency	Slow client downstream (bandwidth or ACK throttling)	Gap between `$request_time` and `$upstream_response_time`
Writing is > 50% of Active, high upstream latency	Backend bottleneck	Upstream response time P95 and upstream header time
Reading is > 20% of Active, low request rate	Slowloris attack or stalled uploads	Per-IP connection concentration from `ss` or access logs
Reading spikes briefly during traffic burst	Legitimate connection initiation	Request rate spike that transitions into Writing within seconds

When Writing dominates and upstream latency is elevated, the bottleneck is behind nginx. The worker is holding the connection open waiting for the upstream application. If upstream latency is normal but $request_time is much larger than $upstream_response_time, the client is slow to receive the response. The Writing state does not distinguish these two cases on its own; you must correlate with upstream timing.
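One way to do that correlation is to average both timings from the access log. The sketch below assumes a hypothetical log format in which each line carries `rt=` and `urt=` fields holding $request_time and $upstream_response_time; neither field name is an nginx default, so adapt them to your own log_format. Lines where the upstream time is `-` or a comma-separated list (multiple upstream attempts) are skipped for simplicity.

```shell
# Average upstream time and client-side gap from access log lines on stdin.
# Assumes a hypothetical log_format such as:
#   log_format timing '$remote_addr rt=$request_time urt=$upstream_response_time';
timing_gap() {
    awk '
        {
            rt = ""; urt = ""
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^rt=/)  rt  = substr($i, 4)
                if ($i ~ /^urt=/) urt = substr($i, 5)
            }
            # Skip lines without a single numeric value for both timings.
            if (rt !~ /^[0-9.]+$/ || urt !~ /^[0-9.]+$/) next
            n++; sum_rt += rt; sum_urt += urt
        }
        END {
            if (n == 0) { print "no usable lines"; exit }
            printf "requests=%d avg_upstream=%.3f avg_client_gap=%.3f\n",
                n, sum_urt / n, (sum_rt - sum_urt) / n
        }
    '
}

# Illustrative sample; the third line is skipped because urt is "-".
timing_gap <<'EOF'
203.0.113.7 rt=1.000 urt=0.200
203.0.113.7 rt=3.000 urt=0.400
198.51.100.2 rt=0.500 urt=-
EOF
```

A large `avg_client_gap` next to a small `avg_upstream` points at slow clients; a large `avg_upstream` points at the backend. Averages hide outliers, so treat this as a first pass before looking at percentiles.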

When Reading dominates with flat or falling throughput, clients are not completing their requests. During a slowloris attack, connections stay in Reading indefinitely because the attacker sends partial headers slowly. Lower client_header_timeout and client_body_timeout from their default 60 seconds and use limit_conn to cap concurrent connections per source IP.
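Before applying per-IP limits, it is worth checking whether connections are actually concentrated on a few sources. The sketch below counts connections per peer address from `ss` output on stdin. It assumes the peer address:port is the last column of each row, which holds for `ss -tn` without process or extended info flags; the function name `top_sources` is hypothetical. It counts all connections it is given, not only those in Reading, since the kernel has no view of nginx's internal state.

```shell
# Count connections per peer IP from ss output on stdin.
# $1 = how many sources to show (defaults to 5).
top_sources() {
    awk '
        NF == 0 { next }                      # ignore blank lines
        NR == 1 && /Peer/ { next }            # skip the header row if present
        {
            peer = $NF                        # last column: peer address:port
            sub(/:[0-9]+$/, "", peer)         # drop the trailing port
            count[peer]++
        }
        END { for (ip in count) print count[ip], ip }
    ' | sort -rn | head -n "${1:-5}"
}

# Illustrative sample. Against a live host, something like:
#   ss -Htn state established '( sport = :80 or sport = :443 )' | top_sources 10
top_sources 2 <<'EOF'
0 0 10.0.0.1:80 203.0.113.7:51000
0 0 10.0.0.1:80 203.0.113.7:51001
0 0 10.0.0.1:80 203.0.113.7:51002
0 0 10.0.0.1:80 198.51.100.2:40000
EOF
```

A single address holding a large share of connections while Reading is elevated is consistent with the slowloris pattern described above, though clients behind a shared NAT or proxy can produce the same picture.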

Common misinterpretations

"High active connections means imminent overload." This is only true if the count approaches worker_connections × worker_processes. If Waiting dominates the ratio, the sockets are idle and cheap to hold. Check connection slot utilization before ordering more capacity.

"High Writing means nginx is slow." Writing counts connections that are waiting for upstream or flushing a response to the client. Nginx itself is event-driven and non-blocking. If Writing is high, look at the backend or the client's network, not the nginx host.

"High Reading means heavy request traffic." Sustained high Reading with a flat request rate means bytes are trickling in slowly, not that the server is busy processing. Large legitimate uploads are an exception, but those correlate with specific endpoints and content lengths.

"Waiting connections are a leak." They are the intended behavior of keepalive. They become a problem only when they crowd out capacity. If Waiting approaches your theoretical maximum, tune keepalive_timeout or keepalive_requests before raising worker_connections.

"The default worker_connections is 1024." The built-in default is 512. Many tutorials and packaged configuration files use 1024, which is why that number is often mistaken for the default. Verify your configured ceiling with nginx -T | grep worker_connections rather than assuming.

Signals to watch in production

Signal	Why it matters	Warning sign
Reading / Active ratio	Reveals stalled request intake or slowloris	Sustained > 20% without a large upload workload
Writing / Active ratio	Reveals downstream or upstream bottlenecks	Sustained > 50% with flat or falling request rate
Waiting / Active ratio	Measures keepalive efficiency	70-90% is normal; approaching 100% may crowd capacity
Connection slot utilization	Hard capacity ceiling	Active / (worker_connections × worker_processes) > 80%
Request rate	Distinguishes live traffic from stuck sockets	Low RPS plus high Active means connections are blocked
Accepts vs. handled gap	Confirmed admission loss	Growing gap means nginx is dropping new connections
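The last two capacity signals in the table can be derived from the same stub_status output. The sketch below is a minimal illustration: the function name `capacity_check` is hypothetical, and worker_processes and worker_connections must be passed in by hand because stub_status does not report them.

```shell
# Slot utilization and accepts-minus-handled gap from stub_status on stdin.
# $1 = worker_processes, $2 = worker_connections (values from your config).
capacity_check() {
    awk -v workers="$1" -v conns="$2" '
        /^Active connections:/ { active = $3 }
        /^ *[0-9]+ +[0-9]+ +[0-9]+ *$/ { accepts = $1; handled = $2 }
        END {
            if (workers * conns == 0) { print "invalid capacity"; exit }
            printf "slot_utilization=%.1f%% dropped=%d\n",
                100 * active / (workers * conns), accepts - handled
        }
    '
}

# Illustrative sample. Against a live server:
#   curl -s http://127.0.0.1/nginx_status | capacity_check 4 1024
capacity_check 4 1024 <<'EOF'
Active connections: 3500
server accepts handled requests
 1000 990 2400
Reading: 100 Writing: 900 Waiting: 2500
EOF
```

For the sample this prints `slot_utilization=85.4% dropped=10`. The accepts and handled counters are cumulative, so the gap is only meaningful as a difference between two samples; a value that keeps growing is the warning sign.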

How Netdata helps

Netdata exposes Reading, Writing, and Waiting as separate dimensions under the nginx collector, making ratio shifts visible in real time without parsing stub_status by hand.
Correlate the connection state breakdown with requests per second on the same dashboard. A divergence between active connections and throughput immediately reveals stuck sockets.
Alert on Reading dominance sustained for multiple minutes alongside low request throughput to catch slowloris patterns before connection slots exhaust.
Track connection slot utilization to distinguish healthy keepalive accumulation from genuine capacity pressure.
Cross-reference Writing spikes with upstream response time metrics to isolate backend slowdowns from client bandwidth limitations.

