NGINX slowloris and slow-client attacks: detection and mitigation
You check stub_status and the Reading count is triple its normal value and not dropping. Requests per second has collapsed to near zero. Active connections are climbing toward worker_connections * worker_processes while worker CPU stays idle. This is not a backend slowdown. It is a slowloris or slow-client attack: connections open faster than they complete, and NGINX waits for data that arrives one byte at a time.
The default client_header_timeout and client_body_timeout of 60 seconds are too generous for most production traffic. client_body_timeout resets on every successive read, so a client sending one byte every 50 seconds can hold a slot indefinitely. Once enough slots are occupied, legitimate clients cannot connect. The symptom is connection exhaustion, but the root cause is behavioral: attackers abuse the wait state, not bandwidth.
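To make the reset behaviour concrete, here is a rough calculation. The helper name `hold_seconds` and the numbers (a 1,000-byte body, one byte every 50 seconds) are assumptions chosen for illustration, not measurements.

```shell
#!/usr/bin/env bash
# Rough hold-time estimate for a slow body sender.
# Assumption: the client sends exactly one byte per interval.
hold_seconds() {
  local body_bytes=$1 interval_s=$2 timeout_s=$3
  if [ "$interval_s" -ge "$timeout_s" ]; then
    # The gap between reads reaches client_body_timeout, so NGINX gives up
    # roughly one timeout period after the last byte it received.
    echo "$timeout_s"
  else
    # The timer restarts on every read, so the slot is held for the whole body.
    echo $(( body_bytes * interval_s ))
  fi
}

hold_seconds 1000 50 60   # prints 50000 (about 13.9 hours)
hold_seconds 1000 50 10   # prints 10
```

With the 60-second default, one connection stays occupied for hours; with a 10-second timeout the same client is cut off after its first idle gap.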
What this means
NGINX allocates a connection structure and a worker event-loop slot for every accepted TCP connection. Normally, a connection moves quickly from Reading (receiving the request header) to Writing (everything after the header is parsed, which in stub_status includes reading the request body as well as sending the response) or Waiting (keepalive idle). In a slowloris attack, the client sends headers so slowly that the connection stays in Reading for minutes; a slow body sender ties up a slot the same way but is counted under Writing. The worker does not stall because the event loop is non-blocking, but the slot is tied up. When all slots are consumed, NGINX stops accepting, and the kernel listen backlog fills and overflows. New connections are then dropped before NGINX can log them.
flowchart TD
A[Client opens connection] --> B{Sends data slowly?}
B -->|Headers| C[NGINX waits for request data]
B -->|Body| C
C --> D[Connection slot consumed]
D --> E{All slots full?}
E -->|No| C
E -->|Yes| F[Kernel drops new connections]
F --> G[Legitimate clients time out]
Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| Slowloris header attack | Reading count sustained high; few source IPs hold many connections each; request rate near zero | stub_status Reading ratio and ss peer IP concentration |
| Slow HTTP POST body attack | Writing high while request bodies trickle in (stub_status counts a connection as Writing once its header has been read); throughput collapsed; body arrives byte-by-byte | client_body_timeout value and access log $request_time |
| Slow read attack | Writing count sustained high; upstream response time normal; total request time inflated | $request_time minus $upstream_response_time gap |
| Legitimate slow clients or large uploads | Reading elevated from many diverse IPs; correlates with known upload endpoints | URI patterns and geographic distribution of source IPs |
Quick checks
# Check connection state breakdown from stub_status
curl -s http://127.0.0.1/nginx_status | awk '/Reading/ {print "R:"$2, "W:"$4, "Wait:"$6}'
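# Identify source IPs with the most established connections
# (sport matches the local listening port; with a single state filter ss omits the State column, so the peer address is field 4)
ss -tn state established '( sport = :80 or sport = :443 )' \
| awk 'NR > 1 {print $4}' \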
| sed 's/]:[0-9]*$/]/; s/:[0-9]*$//' \
| sort | uniq -c | sort -rn | head
# Inspect current timeout directives (warning: nginx -T dumps full config, including secrets)
nginx -T 2>/dev/null | grep -E 'client_header_timeout|client_body_timeout|reset_timedout_connection'
# Calculate connection slot utilization
active=$(curl -s http://127.0.0.1/nginx_status | awk '/Active/ {print $3}')
workers=$(pgrep -c -P "$(cat /var/run/nginx.pid)" -f 'nginx: worker process')
wc=$(nginx -T 2>/dev/null | grep -m1 'worker_connections' | awk '{print $2}' | tr -d ';')
wc=${wc:-512}
max=$((workers * wc))
echo "Utilization: $(echo "scale=1; $active * 100 / $max" | bc)% ($active / $max)"
# Check if NGINX is already dropping connections
curl -s http://127.0.0.1/nginx_status | awk '/^[[:space:]]*[0-9]/ {print "gap=" $1-$2; exit}'
# Check error log for limit or resource exhaustion messages (adjust path if needed)
tail -500 /var/log/nginx/error.log | grep -iE 'limiting|accept4\(\) failed|too many open files|worker_connections are not enough'
How to diagnose it
- Confirm Reading dominance. A brief spike in Reading is normal during traffic bursts. Sustained Reading above 20% of active connections without a corresponding Writing spike signals a slow-client attack.
- Check for throughput collapse. Sample the stub_status requests counter twice, one second apart. If active connections are high but completions per second have collapsed, slots are occupied by incomplete requests.
- Identify source IP concentration. Use ss to list peer addresses in ESTABLISHED state. If a handful of IPs hold 50 or more connections each while legitimate traffic typically shows 1-5 connections per IP, you have an attack.
- Verify timeout settings. Run nginx -T | grep client_header_timeout. Empty output means the 60-second default is in effect. That is too long for most internet-facing applications.
- Distinguish from backend slowness. High Writing and elevated $upstream_response_time point to an upstream bottleneck. In a slowloris attack, Writing is low and upstream time is irrelevant because the request never reaches the upstream phase.
- Check admission loss. An increasing accepts - handled gap or climbing TcpExtListenOverflows means capacity is exhausted and the kernel is silently dropping new connections.
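The first two checks in the list above can be scripted against the stub_status text. The sketch below assumes the standard four-line stub_status layout. The function names `reading_pct` and `requests_total` and the `STATUS_URL` variable are hypothetical; the functions read the status page from stdin, so they can be tried on a saved sample.

```shell
#!/usr/bin/env bash
# Print the Reading share of active connections as a percentage (one decimal).
reading_pct() {
  awk '/^Active connections:/ {active = $3}
       /^Reading:/            {reading = $2}
       END {
         if (active > 0) { printf "%.1f\n", reading * 100 / active }
         else            { print "0.0" }
       }'
}

# Print the cumulative requests counter: the third number on the
# "accepts handled requests" data line.
requests_total() {
  awk 'NF == 3 && $1 ~ /^[0-9]+$/ {print $3; exit}'
}

# Usage (STATUS_URL is an assumption; point it at your stub_status location):
#   STATUS_URL=http://127.0.0.1/nginx_status
#   curl -s "$STATUS_URL" | reading_pct
#   a=$(curl -s "$STATUS_URL" | requests_total); sleep 1
#   b=$(curl -s "$STATUS_URL" | requests_total)
#   echo "requests/s: $((b - a))"   # includes the probe request itself
#   nstat -az TcpExtListenOverflows # kernel-side admission loss
```

Each probe of stub_status is itself counted as a request, so a result of 1 request per second on a busy server means effectively nothing else is completing.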
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
| Reading / active connections ratio | Direct signature of slow-client attacks | >20% sustained for more than 5 minutes |
| Active connections vs maximum capacity | Cliff-edge exhaustion indicator | >80% of worker_connections * worker_processes |
| Requests per second | Throughput reality check | Falling while active connections rise |
| Accepts minus handled gap | Proof that admission control is failing | Increasing over successive samples |
| Connection slot utilization | Overall capacity headroom | >75% sustained |
| Per-IP connection count | Distinguishes attack concentration from organic growth | Single IP holding >50 connections |
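As a minimal sketch, two of the thresholds in the table can be evaluated for a single sample like this. The function name `check_sample` and its argument order are assumptions. It does not implement the "sustained" condition, which needs repeated samples over time.

```shell
#!/usr/bin/env bash
# Evaluate one sample against the Reading-ratio and capacity thresholds.
# Arguments: active connections, Reading count, maximum connection slots.
check_sample() {
  local active=$1 reading=$2 max=$3
  # Integer maths: multiply before dividing so the ratio is not truncated to 0.
  if [ "$active" -gt 0 ] && [ $(( reading * 100 / active )) -gt 20 ]; then
    echo "WARN reading_ratio"
  fi
  if [ "$max" -gt 0 ] && [ $(( active * 100 / max )) -gt 80 ]; then
    echo "WARN slot_utilization"
  fi
  return 0
}

# Example: 900 active, 400 in Reading, 1024 slots -> both warnings fire.
check_sample 900 400 1024
```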
Fixes
Reduce client timeouts
Lower client_header_timeout and client_body_timeout from the 60-second default to values that match your traffic. For typical HTTP APIs, 10-15 seconds is enough. For file uploads, use a location-specific override. Remember that client_body_timeout resets on each successive read. A client sending one byte every 9 seconds will never time out with a 10-second threshold. If you must support slow uploads, pair aggressive timeouts with strict per-IP connection limits instead of relying on timeout generosity alone. Test changes in staging first; a reload applies the new timeout to new connections only.
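A configuration along these lines is one way to express that. The `/upload` path and the 30-second value are hypothetical placeholders, and the 10-second values are simply taken from the range suggested above; treat it as a starting sketch rather than a recommended setting.

```nginx
http {
    # Tight defaults for ordinary requests.
    client_header_timeout 10s;
    client_body_timeout   10s;

    server {
        listen 80;

        # Hypothetical upload endpoint: tolerate longer gaps between
        # successive body reads here only.
        location /upload {
            client_body_timeout 30s;
        }
    }
}
```

client_header_timeout is valid only at the http and server levels, so the per-location override applies to the body timeout alone.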
Enable reset_timedout_connection
Set reset_timedout_connection on;. By default, NGINX closes timed-out connections with a TCP FIN. If the client is unresponsive, the socket can linger in FIN_WAIT1, consuming file descriptors and memory. reset_timedout_connection forces a TCP RST, reclaiming resources immediately. The tradeoff is an abrupt client termination, which is acceptable during an attack.
Enforce per-IP connection limits
Add a shared memory zone and a limit_conn rule:
limit_conn_zone $binary_remote_addr zone=addr:10m;
limit_conn addr 50;
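Any IP exceeding 50 concurrent connections receives a 503 response. Note that limit_conn counts a connection only once its whole request header has been read, so it caps slow-body and slow-read connections but not connections that are still trickling headers; those are bounded by client_header_timeout. The tradeoff is that NATed users behind a corporate or mobile gateway share an IP. If your user base is heavily NATed, a low limit will block legitimate users. Start with a value above your normal per-IP peak and tune downward. When the zone fills, NGINX returns an error to all further requests instead of evicting old entries, so size the zone for at least 2x your expected peak unique IP count.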
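For zone sizing, a back-of-the-envelope estimate is usually enough. The sketch below assumes roughly 16,000 states per megabyte, which is the figure the NGINX documentation gives for 64-byte states on 64-bit platforms; the function name `zone_mb` and the example IP count are hypothetical.

```shell
#!/usr/bin/env bash
# Estimate the limit_conn_zone size in megabytes for a number of unique IPs.
# Arguments: expected peak unique IPs, headroom multiplier (default 2).
zone_mb() {
  local peak_ips=$1 headroom=${2:-2} per_mb=16000
  # Ceiling division so a partial megabyte rounds up.
  echo $(( (peak_ips * headroom + per_mb - 1) / per_mb ))
}

zone_mb 60000   # prints 8
```

Under that assumption, the 10m zone shown above has room for roughly 160,000 tracked addresses.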
Block at the firewall
If ss identifies specific attacking IPs, block them at the host firewall or upstream edge. This is faster than allowing the traffic to reach NGINX. For immediate host-level relief, use iptables -I INPUT -s <ip> -j DROP or equivalent (-I inserts the rule at the top of the chain, whereas -A appends it after any existing ACCEPT rules that may match first). Distribute the block to your edge firewall if the attack volume threatens NIC saturation before it reaches the host TCP stack. The tradeoff is that distributed attacks rotate IPs, so firewall rules are temporary relief, not a structural fix.
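A cautious way to go from a per-IP count to firewall rules is to print the commands for review instead of running them. In this sketch the function name `block_commands` is hypothetical, the input is assumed to be "count ip" pairs such as the output of sort | uniq -c, and only IPv4 addresses are handled (IPv6 would need ip6tables).

```shell
#!/usr/bin/env bash
# Read "count ip" pairs on stdin and print one iptables command for every
# IPv4 address whose count exceeds the threshold. Nothing is executed.
block_commands() {
  local threshold=${1:-50}
  awk -v t="$threshold" '
    $1 > t && $2 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ {
      # -I puts the rule at the top of INPUT, ahead of existing ACCEPT rules.
      printf "iptables -I INPUT -s %s -j DROP\n", $2
    }'
}

# Example with documentation-range addresses:
printf '120 203.0.113.7\n3 198.51.100.2\n' | block_commands 50
```

Review the printed list for shared NAT gateways or your own monitoring hosts before applying any of it.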
Increase worker_connections (emergency only)
Raising worker_connections and reloading provides immediate headroom, but only buys time. It does not fix the attack. Every proxied request consumes two connection slots (client-facing plus upstream), so effective proxy capacity is half the configured value.
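The halving is easy to overlook when estimating headroom. A small sketch of the arithmetic, with the function name `effective_capacity` and the example values (4 workers, 1024 connections each) as assumptions:

```shell
#!/usr/bin/env bash
# Effective client capacity. Arguments: worker count, worker_connections,
# and 1 if every request is proxied upstream (default) or 0 if served locally.
effective_capacity() {
  local workers=$1 wc=$2 proxied=${3:-1}
  local total=$(( workers * wc ))
  if [ "$proxied" -eq 1 ]; then
    # Each proxied request holds a client slot and an upstream slot.
    echo $(( total / 2 ))
  else
    echo "$total"
  fi
}

effective_capacity 4 1024     # prints 2048
effective_capacity 4 1024 0   # prints 4096
```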
Prevention
- Set client_header_timeout and client_body_timeout to values that match your traffic profile. Do not leave the 60-second defaults on internet-facing servers.
- Configure limit_conn_zone and limit_conn before an attack occurs. Monitor zone allocation errors (could not allocate node in the error log) to ensure the zone is large enough.
- Enable reset_timedout_connection on to prevent FIN_WAIT1 accumulation.
- Monitor the Reading/active ratio proactively. A slow rise is easier to catch than a capacity cliff.
- Size worker_connections for peak traffic plus attack headroom, and ensure worker_rlimit_nofile is at least double that to accommodate upstream connections, log files, and temp files.
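The last item in the list can be checked mechanically. This is a minimal sketch that reads configuration text on stdin; the function name `check_nofile` is hypothetical, and it assumes each directive sits on its own line with a single numeric argument.

```shell
#!/usr/bin/env bash
# Compare worker_rlimit_nofile with twice worker_connections.
check_nofile() {
  awk '
    $1 == "worker_connections"   { gsub(/;/, "", $2); wc = $2 }
    $1 == "worker_rlimit_nofile" { gsub(/;/, "", $2); nofile = $2 }
    END {
      if (wc == "") { wc = 512 }   # NGINX default when the directive is absent
      if (nofile == "") {
        print "worker_rlimit_nofile not set"
      } else if (nofile + 0 < 2 * wc) {
        print "too low: " nofile " < " 2 * wc
      } else {
        print "ok: " nofile " >= " 2 * wc
      }
    }'
}

# Usage: nginx -T 2>/dev/null | check_nofile
```

When worker_rlimit_nofile is not set, workers inherit the limit of the process that started them, so check that limit separately.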
How Netdata helps
Netdata collects stub_status Reading, Writing, and Waiting states every second. A Reading spike is visible as it happens. Active connections and slot-utilization metrics include pre-built thresholds for danger zones above 80%. Netdata tracks the accepts-handled gap and alerts when NGINX starts dropping connections. It correlates NGINX connection states with kernel TCP metrics like listen queue overflows, so you can confirm whether the kernel is already dropping connections.
Related guides
- How NGINX actually works in production: a mental model for operators
- NGINX active connections climbing: reading, writing, waiting explained
- NGINX connection exhaustion: detection, diagnosis, and prevention
- NGINX dropped connections: the accepts vs handled gap
- NGINX monitoring checklist: the signals every production server needs
- NGINX monitoring maturity model: from survival to expert
- nginx: worker_connections are not enough - causes and fixes
- NGINX worker_connections and worker_processes: sizing for real traffic







