NGINX $request_time vs $upstream_response_time: isolating where latency lives
P95 $request_time doubles. The default assumption is upstream degradation, so you scale the backend, tune the database, and add instances. The latency barely moves, because the real bottleneck was a slow mobile client, proxy temp-file disk I/O, or a large request body on a lossy network. This is one of the more common nginx misdiagnoses.
$request_time measures the full cycle from the first byte read from the client to the last byte sent to the client. It includes reading the request, waiting for the upstream, and writing the response. $upstream_response_time measures only the backend portion, from establishing the upstream connection to receiving the last byte of the response body. The gap between them is where client-side, network, and nginx-internal delays live. To avoid chasing phantom backend problems, log both variables and compare them.
What these variables actually measure
$request_time is the total duration from when nginx reads the first byte from the client until it sends the last byte of the response body. For proxied traffic:
$request_time = client read time + upstream wait + client write time
Slow clients, large uploads, mobile networks, and disk I/O inside nginx all inflate this number independently of upstream health.
$upstream_response_time is the time from establishing a connection to the upstream server through receiving the last byte of the response body. It isolates backend latency. If this value is high and close to $request_time, the backend is the bottleneck. If $request_time is much higher, the problem is elsewhere.
$upstream_header_time, introduced in nginx 1.7.10, measures from connection establishment to the first byte of the upstream response header. Comparing it to $upstream_response_time tells you whether the backend is slow to process the request or slow to transfer the response body.
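$upstream_connect_time, introduced in nginx 1.9.1, measures only the time spent establishing the connection to the upstream: the TCP connect and, for TLS upstreams, the handshake. A value of 0.000 typically means the request reused a connection from the keepalive pool. DNS resolution time is not included; nginx resolves upstream hostnames before this phase begins (at configuration load for static names, or through the resolver when proxy_pass uses variables).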
For requests that never reach an upstream, such as cache hits, static files, or nginx-generated errors, the $upstream_* variables log as -.
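As a minimal sketch of how these four fields might be pulled out of an access log, assume a hypothetical log_format that appends `rt=$request_time uct=$upstream_connect_time uht=$upstream_header_time urt=$upstream_response_time` to each line. The `rt`, `uct`, `uht`, and `urt` labels and the `parse_timings` helper are illustrative names, not nginx defaults.

```python
import re

# Assumed (hypothetical) log_format suffix:
#   rt=$request_time uct=$upstream_connect_time uht=$upstream_header_time urt=$upstream_response_time
TIMING_RE = re.compile(r'\b(rt|uct|uht|urt)=(\S+)')


def parse_timings(line):
    """Return a dict of timing fields in seconds.

    A field logged as '-' (no upstream involved) becomes None.
    Fields holding several values (retries) are out of scope for this
    sketch and also come back as None instead of a misleading number.
    """
    timings = {}
    for key, raw in TIMING_RE.findall(line):
        try:
            timings[key] = float(raw)
        except ValueError:
            timings[key] = None
    return timings


proxied = '10.0.0.1 "GET /api HTTP/1.1" 200 rt=5.012 uct=0.000 uht=0.048 urt=0.050'
static = '10.0.0.2 "GET /logo.png HTTP/1.1" 200 rt=0.001 uct=- uht=- urt=-'
print(parse_timings(proxied))
print(parse_timings(static))
```

Mapping `-` to None rather than 0 keeps static and cached requests from being mistaken for very fast upstream responses.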
sequenceDiagram
participant C as Client
participant N as NGINX
participant U as Upstream
C->>N: First byte
Note over C,N: $request_time starts
N->>U: Establish connection
Note over N,U: $upstream_connect_time
U->>N: First header byte
Note over N,U: $upstream_header_time
U->>N: Last body byte
Note over N,U: $upstream_response_time
N->>C: Last byte
Note over C,N: $request_time ends

How to read the gap
client_and_nginx_overhead = $request_time - $upstream_response_time
If the gap is small and both metrics rise together, the backend is the bottleneck.
If the gap is large and $upstream_response_time is low, the delay is on the client side or inside nginx. Common culprits are slow consumer networks, request body uploads taking too long, or proxy buffer spill to disk.
Use $upstream_header_time to narrow it further:
- If $upstream_header_time dominates $upstream_response_time, the backend application is slow to generate the response.
- If $upstream_response_time is much larger than $upstream_header_time, the delay is in transferring the response body from the backend to nginx, not in processing.
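The decision rules above can be sketched as a small classifier. The function name, the labels it returns, and both ratio thresholds (`gap_ratio`, `header_ratio`) are illustrative assumptions; tune them against your own traffic rather than treating them as recommended values.

```python
def classify(request_time, upstream_response_time, upstream_header_time,
             gap_ratio=0.5, header_ratio=0.8):
    """Guess where the latency of one request lives.

    All times are in seconds; None means the field was logged as '-'.
    gap_ratio and header_ratio are assumed thresholds, not nginx values.
    """
    if upstream_response_time is None:
        return "no-upstream"  # cache hit, static file, or nginx-generated response

    # Time spent outside the upstream: client read/write and nginx internals.
    gap = request_time - upstream_response_time
    if request_time > 0 and gap / request_time >= gap_ratio:
        return "client-or-nginx"

    if upstream_header_time is not None and upstream_response_time > 0:
        if upstream_header_time / upstream_response_time >= header_ratio:
            return "backend-processing"   # slow to produce the first header byte
        return "backend-body-transfer"    # headers arrived quickly, body was slow
    return "backend"


print(classify(5.0, 0.05, 0.048))   # large gap, fast upstream
print(classify(2.0, 1.9, 1.8))      # small gap, header time dominates
print(classify(2.0, 1.9, 0.1))      # small gap, body transfer dominates
```

A per-request label like this is most useful when counted over a time window, since a single slow client says little about the overall trend.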
Where the gap comes from in production
Slow clients on a fast backend. A backend can respond in 50 ms while a mobile client on a slow network takes 5 seconds to download the payload. $upstream_response_time shows 0.05 s while $request_time shows 5 s. Without logging both variables, this looks like a server-side latency regression.
Proxy buffer spill to disk. When an upstream response exceeds the configured proxy buffers, nginx writes the overflow to temporary files under proxy_temp_path. The disk I/O adds latency that appears only in $request_time, not in $upstream_response_time. If you see a growing gap on large responses, check for temp-file creation and disk saturation on the nginx node.
Large request bodies. A client uploading a multi-megabyte payload over a slow link keeps $request_time high while the upstream sits idle. With request buffering on, the default, nginx reads the whole body before it contacts the upstream, so the upload time never appears in $upstream_response_time.
Upstream retries. When nginx retries a request, $upstream_response_time can contain multiple comma- or colon-separated values. A log parser that treats the field as a single float will silently corrupt aggregations. Split on delimiters before converting.
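A minimal sketch of the split-before-convert step, assuming the raw field has already been extracted from the log line as a string. The helper name `parse_upstream_times` is hypothetical, and skipping `-` entries is a design choice of this sketch.

```python
import re


def parse_upstream_times(field):
    """Split an $upstream_response_time field into one float per attempt.

    Values are separated by commas and colons; '-' entries carry no
    timing and are skipped.
    """
    parts = re.split(r'\s*[,:]\s*', field.strip())
    return [float(p) for p in parts if p and p != '-']


print(parse_upstream_times("0.003"))
print(parse_upstream_times("0.010, 0.200"))
print(parse_upstream_times("0.010, 0.020 : 0.150"))
print(parse_upstream_times("-"))
```

Whether to aggregate the last value, the maximum, or the sum depends on the question: the sum approximates total time spent on upstreams for the request, while the last value describes the attempt that produced the response.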
Cached and static responses. For cache hits or static files served directly by nginx, $upstream_response_time is logged as -. Dashboards that average upstream latency across all requests without filtering out these lines produce misleading aggregates.
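To illustrate the filtering step, here is a sketch that computes an upstream latency percentile only over requests that actually reached an upstream. It assumes single-valued fields, uses the nearest-rank percentile method, and the function name is hypothetical.

```python
import math


def upstream_percentile(raw_values, pct=95):
    """Nearest-rank percentile of upstream times, ignoring '-' entries.

    raw_values: iterable of strings as logged, e.g. ['0.050', '-', '0.900'].
    Returns None when no request in the window reached an upstream.
    """
    samples = sorted(float(v) for v in raw_values if v != '-')
    if not samples:
        return None
    rank = math.ceil(pct / 100 * len(samples))  # 1-based nearest rank
    return samples[max(rank, 1) - 1]


window = ['0.050', '-', '0.040', '-', '0.900', '0.060']
print(upstream_percentile(window, 95))
print(upstream_percentile(window, 50))
```

Counting the `-` lines as zeros in the same window would pull the average and the lower percentiles down, hiding a backend regression behind a healthy cache hit rate.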
Operator gotchas
Proxy buffering changes timer behavior. With proxy_buffering on, the default, nginx reads the full upstream response as fast as possible into memory or disk, so $upstream_response_time ends early. $request_time continues until the client finishes downloading. With proxy_buffering off, the two timers are closer, but $request_time still includes client send time and will always be greater than or equal to $upstream_response_time.
The multi-value trap. When nginx tries multiple upstreams, the field contains delimited values. Any log parser doing a straight numeric conversion on this field will fail or produce garbage on retry events.
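Version availability. $upstream_header_time requires nginx 1.7.10 or later. $upstream_connect_time requires 1.9.1 or later. On an older build, nginx does not log empty values for these variables; it rejects a log_format that references them with an unknown-variable error at configuration load.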
Signals to watch in production
| Signal | Why it matters | Warning sign |
|---|---|---|
| $request_time minus $upstream_response_time | Reveals client-side, network, and disk I/O delays that the backend cannot fix | Gap increases while $upstream_response_time stays flat |
| $upstream_response_time P95 | Backend latency isolated from client effects | Sustained increase >20% from baseline |
| $upstream_header_time | Distinguishes slow application processing from large body transfer | Header time dominates total upstream response time |
| $upstream_connect_time | Connection establishment overhead and keepalive pool efficiency | P95 > 100 ms without network reason, or values rarely 0.000 |
| Multi-value upstream times | Indicates nginx retried across multiple upstreams | Field contains commas or colons |
How Netdata helps
- Parses access logs to chart $request_time and $upstream_response_time together, surfacing the gap directly.
- Correlates nginx timing with disk I/O to flag proxy buffer spill.
- Tracks $upstream_connect_time to surface keepalive pool inefficiency.
- Alerts on upstream latency percentiles independently of total request time.
- Cross-references 502/504 rates with upstream time to distinguish backend failures from client delays.