nginx upstream prematurely closed connection while reading response header
The error “upstream prematurely closed connection while reading response header from upstream” means the upstream server closed the TCP connection before nginx had received a complete response header. nginx returns 502 Bad Gateway to the client. Unlike a timeout, the upstream actively terminated the connection.
The root cause is typically on the backend: a crash, a worker recycle, a request size limit, or a stale keepalive connection that the backend closed just as nginx tried to reuse it. nginx retries the request on another backend only if proxy_next_upstream includes error, which it does by default (the default value is error timeout). Requests with non-idempotent methods such as POST are not retried once they have been sent upstream unless non_idempotent is also set. Retries improve availability but do not fix the underlying issue.
What this means
When nginx proxies a request, it opens or reuses a TCP connection to an upstream, sends the request, and waits for response headers. If the upstream closes the connection before nginx finishes reading those headers, nginx logs this error and returns 502. The upstream terminated the socket mid-response.
This can happen on a brand-new connection or on a reused keepalive connection. In the keepalive case, the backend decided the connection had been idle for too long and closed it, but the nginx worker still held the socket in its keepalive cache and reused it for the next request. Because the close is detected while nginx is reading the response, it is logged as a premature close rather than a connect failure.
flowchart TD
A[502 + upstream prematurely closed connection] --> B{Correlate with reload or deploy?}
B -->|Yes| C[Worker or backend recycled]
B -->|No| D{Backend crashing?}
D -->|Yes| E[Fix application crash or OOM]
D -->|No| F{Keepalive timeout mismatch?}
F -->|Yes| G[Align backend and nginx timeouts]
F -->|No| H{Protocol mismatch?}
H -->|Yes| I[Switch proxy_pass to https]
H -->|No| J[Check request size limits]
Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| Backend application crash or restart | 502s spike abruptly and correlate with backend deploys or OOM kills. | Application logs and process restart timestamps around the nginx error. |
| Backend keepalive_timeout shorter than nginx’s | Intermittent 502s on low-traffic endpoints where nginx reuses a stale connection. | Compare backend idle timeout against nginx upstream keepalive_timeout. |
| Worker process recycle or rolling update | Transient 502s during nginx reloads or backend deployments that resolve within seconds. | Whether errors align with nginx -s reload events or backend pod restarts. |
| Backend request or response size limits | 502s triggered by large POST bodies or heavy responses; backend closes connections exceeding limits. | Backend logs for payload or size-related errors. |
| HTTP to HTTPS protocol mismatch | proxy_pass http:// sent to an HTTPS-only backend causes immediate connection close. | proxy_pass scheme matches the backend listener protocol. |
Quick checks
# Check error log for the exact message and timestamp
grep "upstream prematurely closed connection while reading response header" /var/log/nginx/error.log | tail -20
# Check recent 502 responses in the access log
grep ' 502 ' /var/log/nginx/access.log | tail -20
# Probe backend ports directly from the nginx host
for backend in 10.0.1.10:8080 10.0.1.11:8080; do
  timeout 2 bash -c "echo > /dev/tcp/${backend%:*}/${backend#*:}" 2>/dev/null && echo "$backend UP" || echo "$backend DOWN"
done
# Verify nginx upstream keepalive configuration
nginx -T 2>/dev/null | grep -A5 -E "upstream|keepalive|keepalive_timeout|keepalive_requests"
# Check for recent nginx reload events (logged at notice level, so error_log must include notice)
grep "reconfiguring" /var/log/nginx/error.log | tail -10
# Check for recent OOM kills or segfaults that coincide with 502s
dmesg -T | grep -iE "killed process|segfault|oom" | tail -10
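To see whether the errors cluster around reloads, the error-log and reload checks above can be folded into per-minute buckets. This is a minimal sketch, not a definitive tool: the function name errors_per_minute is hypothetical, and it assumes the default error_log line layout that starts with a date and a time.

```shell
# Count premature-close errors and reload notices per minute.
# ASSUMPTION: error log lines start with "YYYY/MM/DD HH:MM:SS", which is
# the default nginx error_log layout. The function name is hypothetical.
errors_per_minute() {
  awk '
    # Bucket key is the date plus the HH:MM part of the time field.
    /upstream prematurely closed connection/ { errs[$1 " " substr($2, 1, 5)]++ }
    /reconfiguring/                          { reloads[$1 " " substr($2, 1, 5)]++ }
    END {
      for (m in errs)    seen[m] = 1
      for (m in reloads) seen[m] = 1
      for (m in seen)
        printf "%s errors=%d reloads=%d\n", m, errs[m] + 0, reloads[m] + 0
    }
  ' "$@" | sort
}
```

Calling errors_per_minute /var/log/nginx/error.log prints one line per minute that had either event. A minute with a nonzero reloads count next to a burst of errors points toward connection recycling rather than a backend crash.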
How to diagnose it
- Isolate the failing backend. Parse $upstream_addr in access logs for 502 responses. If one server dominates, investigate it first.
- Check backend process health. Look for application crashes, OOM kills, or worker restarts in backend logs that match the nginx error timestamps.
- Correlate with deployments or reloads. If the errors started within seconds of an nginx -s reload or a backend rolling update, the cause is likely connection recycling. Check error.log for reload notices.
- Test keepalive alignment. Look at $upstream_connect_time in access logs. Near-zero values indicate keepalive reuse. If 502s occur on connections with near-zero connect time, the backend likely closed the socket while it was idle.
- Inspect payload sizes. Check $request_length and $body_bytes_sent for failed requests. If 502s only appear above a size threshold, the backend may enforce a payload limit.
- Verify the proxy scheme. Ensure proxy_pass uses https:// if the upstream expects TLS. Using http:// against an HTTPS listener causes the backend to close the connection immediately.
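The first and fourth steps can be combined into a single pass over the access log. The sketch below is illustrative only. It assumes a custom log_format that writes key=value fields named status=, upstream_addr= and upstream_connect_time=. nginx's predefined combined format does not log the upstream variables, so these field names are hypothetical, as are the function name summarize_502s and the 1 ms reuse threshold.

```shell
# Summarize 502 responses per upstream and count how many happened on
# reused keepalive connections (connect time at or near zero).
# ASSUMPTION: log lines carry key=value fields status=, upstream_addr=
# and upstream_connect_time= (a hypothetical custom log_format).
summarize_502s() {
  awk '
    {
      status = ""; addr = ""; ct = ""
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^status=/)                status = substr($i, 8)
        if ($i ~ /^upstream_addr=/)         addr   = substr($i, 15)
        if ($i ~ /^upstream_connect_time=/) ct     = substr($i, 23)
      }
      # After a retry nginx logs a comma-separated list; keep the first entry.
      sub(/,$/, "", addr)
      if (status != "502" || addr == "") next
      total[addr]++
      # Treat a connect time of 1 ms or less as keepalive reuse (assumed threshold).
      if (ct != "" && ct != "-" && ct + 0 <= 0.001) reused[addr]++
    }
    END {
      for (a in total)
        printf "%s total=%d reused=%d\n", a, total[a], reused[a] + 0
    }
  ' "$@" | sort
}
```

Run as summarize_502s /var/log/nginx/access.log. One address dominating the totals suggests a single unhealthy backend, while a high reused count across all backends is more consistent with a keepalive timeout mismatch.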
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
| 502 response rate | Direct measure of user impact from this error. | >1% sustained or a sudden spike correlating with the error log. |
| Upstream connect time ($upstream_connect_time) | Distinguishes new connections from reused keepalive connections. | 502s occurring at near-zero connect time indicate a keepalive reuse failure. |
| Upstream response time ($upstream_response_time) | Shows whether the backend slowed before closing the connection. | P95 trending up before 502 spikes suggests backend degradation. |
| Writing connections (stub_status) | Reveals connections stuck waiting for upstream responses. | Writing count climbing while request rate stays flat. |
| Accepts vs handled gap | Rules out connection exhaustion that can masquerade as upstream failure. | Gap growing while 502s occur means nginx is also dropping connections. |
| Backend process restart rate | Links 502s to application-level crashes or recycling. | Restarts correlating with nginx error timestamps. |
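When no metrics pipeline is in place yet, the 502 response rate from the table can be approximated straight from the access log. This is a rough sketch that assumes nginx's predefined combined log format, where the status code is the first token after the quoted request line. The function name rate_502 is hypothetical.

```shell
# Print the share of 502 responses in an access log as a percentage.
# ASSUMPTION: the log uses the predefined "combined" format, so splitting
# each line on double quotes leaves " STATUS BYTES " in the third field.
rate_502() {
  awk -F'"' '
    {
      n = split($3, f, " ")
      if (n < 1) next          # skip lines that do not match the format
      total++
      if (f[1] == "502") bad++
    }
    END {
      if (total == 0) { print "0.00"; exit }
      printf "%.2f\n", 100 * bad / total
    }
  ' "$@"
}
```

For a recent window, something like tail -n 10000 /var/log/nginx/access.log | rate_502 gives a number to compare against the 1% warning level above.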
Fixes
Align keepalive timeouts
If the backend closes idle connections faster than nginx expects, nginx hands dead sockets to workers.
- Reduce nginx’s upstream keepalive_timeout to be shorter than the backend’s idle timeout.
- Alternatively, increase the backend’s idle timeout if you control it.
- Ensure the keepalive directive is present in the upstream block to enable pooling.
Tradeoff: Shorter timeouts reduce reuse efficiency and increase TCP handshake overhead. Longer timeouts hold file descriptors open.
Enable proper upstream keepalive
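Without a keepalive pool, nginx opens a new upstream connection for every request. With a pool, nginx can still hit this error if it holds idle connections that the backend has already closed, so the pool settings and the backend’s idle timeout need to agree.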
- Add keepalive <count>; to the upstream block.
- Set proxy_http_version 1.1; in the location block.
- Set proxy_set_header Connection ""; so nginx does not send Connection: close to the upstream.
Tradeoff: Keepalive pools consume memory and file descriptors per worker. Size the pool for your traffic.
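A quick way to find upstream blocks that are missing the pool directive is to scan the configuration dump. The following is a sketch under simplifying assumptions: each upstream block opens with upstream NAME { on one line and closes with } on its own line, and the function name upstreams_without_keepalive is hypothetical.

```shell
# List upstream blocks that have no "keepalive <count>;" directive.
# Reads an nginx configuration dump on stdin or from files given as arguments.
# ASSUMPTION: "upstream NAME {" sits on one line and the closing "}" is on
# its own line; unusual layouts will not be parsed correctly.
upstreams_without_keepalive() {
  awk '
    /^[ \t]*upstream[ \t]+[^ \t{]+[ \t]*[{]/ {
      name = $2; sub(/[{].*/, "", name)
      in_block = 1; has_keepalive = 0
      next
    }
    # Matches "keepalive 32;" but not keepalive_timeout or keepalive_requests.
    in_block && /^[ \t]*keepalive[ \t]+[0-9]+/ { has_keepalive = 1 }
    in_block && /^[ \t]*[}]/ {
      if (!has_keepalive) print name
      in_block = 0
    }
  ' "$@"
}
```

Piping nginx -T 2>/dev/null into upstreams_without_keepalive prints the names of upstreams that still open a new connection per request. An empty result means every upstream block it could parse declares a pool.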
Fix backend crashes and resource limits
- Check application logs for unhandled exceptions, OOM kills, or worker pool exhaustion.
- Increase backend memory, worker counts, or payload limits if crashes are load-related.
- Use health checks (nginx Plus or a sidecar) to remove unhealthy backends before traffic hits them.
Tradeoff: Aggressive health checks add probe traffic and may hide intermittent issues.
Handle reload and deployment transients
During rolling updates or nginx reloads, in-flight keepalive connections may be closed.
- Ensure proxy_next_upstream error is configured so nginx retries on another backend.
- Set worker_shutdown_timeout to prevent old workers from lingering indefinitely.
- In Kubernetes, use a pre-stop sleep to let connections drain before the backend container exits.
Tradeoff: Retries add latency for the affected request but improve perceived availability.
Verify proxy scheme
If proxy_pass uses http:// but the upstream requires TLS:
- Change to proxy_pass https://<upstream>;.
- Ensure the backend certificate is trusted if nginx verifies it.
Tradeoff: HTTPS upstreams add TLS overhead. Keepalive is essential to amortize handshake cost.
Prevention
- Log and alert on the exact error string. A single line is noise; a rate increase is a signal.
- Standardize keepalive timeouts so nginx always has a shorter upstream keepalive_timeout than the backend.
- Monitor upstream connect time. A drop in keepalive reuse efficiency predicts this error before it floods logs.
- Monitor backend process health independently of nginx. nginx only knows the backend closed the socket; it cannot see why.
- Set worker_shutdown_timeout to force old workers to exit and avoid holding connections that backends have cleaned up.
- Audit proxy_pass schemes in configuration reviews. HTTP to HTTPS mismatches are easy to overlook.
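The proxy_pass audit in the last item can be partly automated by listing every target with its scheme. This is a minimal sketch that assumes one directive per line in the configuration. The function name list_proxy_pass_schemes is hypothetical, and targets without an http or https scheme (for example, variables) are reported as other.

```shell
# Print "<scheme> <target>" for every proxy_pass directive in a config dump.
# Reads stdin or files given as arguments.
# ASSUMPTION: each proxy_pass directive starts its own line.
list_proxy_pass_schemes() {
  awk '
    /^[ \t]*proxy_pass[ \t]+/ {
      target = $2; sub(/;.*$/, "", target)
      scheme = "other"
      if (target ~ /^https:\/\//)     scheme = "https"
      else if (target ~ /^http:\/\//) scheme = "http"
      print scheme, target
    }
  ' "$@"
}
```

Running nginx -T 2>/dev/null | list_proxy_pass_schemes gives a short list to compare against what each backend listener actually expects. The script cannot tell whether a backend requires TLS, so that comparison stays a manual step.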
How Netdata helps
- Correlate nginx 502 rate with upstream response time to see if the backend slowed before closing connections.
- Track active connection Writing states to detect upstream stalls that precede premature closes.
- Monitor backend process restarts and OOM kills on the same timeline as nginx 502 spikes.
- Alert on nginx error log patterns matching the exact “prematurely closed connection” string.
- Track file descriptor usage and connection slot utilization per worker to rule out nginx-side exhaustion.
Related guides
- How NGINX actually works in production: a mental model for operators
- nginx 502 Bad Gateway: causes and how to fix it
- nginx 504 Gateway Time-out: causes and fixes
- NGINX active connections climbing: reading, writing, waiting explained
- nginx connect() failed (111: Connection refused) while connecting to upstream
- NGINX connection exhaustion: detection, diagnosis, and prevention
- NGINX dropped connections: the accepts vs handled gap
- NGINX monitoring checklist: the signals every production server needs
- NGINX monitoring maturity model: from survival to expert
- nginx no live upstreams while connecting to upstream: what it means
- NGINX slowloris and slow-client attacks: detection and mitigation
- nginx: too many open files - diagnosing file descriptor exhaustion







