How NGINX actually works in production: a mental model for operators
NGINX is not a multi-threaded server that spawns a thread per connection. It is an event-driven, non-blocking, single-threaded-per-worker process architecture. If you are debugging a production incident where connections are timing out, memory is climbing, or CPU is pinned, this architecture is the lens through which every symptom must be interpreted.
Most production issues involving NGINX are not NGINX bugs. They are resource accounting problems: file descriptors, connection slots, buffer boundaries, or event loop latency. An operator who understands the internal mechanics can read the signals correctly instead of chasing phantom upstream problems or adding hardware that hits the same limit.
After reading this article, you will understand how the master process, worker event loops, connection state machine, upstream keepalive pools, and the proxy connection multiplier interact. You will know why capacity in NGINX is a set of hard cliffs rather than graceful curves, and which signals predict each cliff.
What it is and why it matters
NGINX uses a master/worker process model. The master process reads configuration, binds to listening ports, and spawns worker processes. It handles signal-based lifecycle events such as reload, upgrade, and shutdown. The master never processes client traffic directly. It holds the listening sockets open and passes them to workers via fork inheritance.
Worker processes handle all traffic. Each worker runs a single-threaded event loop. On Linux this loop uses epoll; on BSD it uses kqueue. The worker registers interest in socket events with the kernel, then iterates through ready events without blocking. This design allows one worker to manage tens of thousands of concurrent connections, but it also means that any blocking operation stalls every connection on that worker.
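The loop described above can be sketched in a few lines of Python. This is an illustration of the readiness-based pattern only, not NGINX's implementation: the standard `selectors` module picks epoll or kqueue where available, and the helper names `serve_ready_events` and `make_echo_handler` are hypothetical.

```python
import selectors
import socket


def serve_ready_events(sel: selectors.BaseSelector, timeout: float = 0.1) -> int:
    """Run one loop iteration: handle every socket the kernel reports as ready."""
    handled = 0
    for key, _mask in sel.select(timeout):
        handler = key.data            # callback registered alongside the socket
        handler(key.fileobj)          # a slow handler here delays every other socket
        handled += 1
    return handled


def make_echo_handler(sel: selectors.BaseSelector):
    """Build a handler that upper-cases whatever it reads (stand-in for request handling)."""
    def echo(conn: socket.socket) -> None:
        data = conn.recv(4096)        # does not block: the socket was reported readable
        if data:
            conn.send(data.upper())
        else:                         # peer closed the connection
            sel.unregister(conn)
            conn.close()
    return echo


if __name__ == "__main__":
    sel = selectors.DefaultSelector()
    client, server_side = socket.socketpair()
    server_side.setblocking(False)
    sel.register(server_side, selectors.EVENT_READ, make_echo_handler(sel))
    client.send(b"ping")
    serve_ready_events(sel)
    print(client.recv(4096))
```

The point of the sketch is the single `for` loop: everything a handler does runs inline, so one slow handler holds up all the other ready sockets.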
The mental model matters because capacity limits in NGINX are hard cliffs. When worker_connections is exhausted, new connections drop immediately. When file descriptors are exhausted, accept() fails with EMFILE. When the event loop stalls because of synchronous disk I/O or a slow system call, every connection on that worker slows down. Understanding these boundaries prevents misdiagnosing a connection limit as an upstream outage.
This architecture also means that monitoring must happen at multiple levels. The worker sees only what the kernel delivers through the event loop. Kernel-level drops, such as listen queue overflows, are invisible to NGINX logs entirely. A complete mental model requires looking at both the process internals and the surrounding OS boundary.
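Because listen queue overflows never reach NGINX, they have to be read from the kernel. A minimal sketch, assuming a Linux host where `/proc/net/netstat` is readable; the function names are hypothetical and the sample values used to exercise the parser are invented.

```python
def parse_netstat(text: str) -> dict[str, int]:
    """Parse the header-line / value-line pairs used by /proc/net/netstat."""
    counters: dict[str, int] = {}
    lines = text.strip().splitlines()
    # Lines come in pairs: "TcpExt: Name1 Name2 ..." then "TcpExt: 1 2 ..."
    for header, values in zip(lines[0::2], lines[1::2]):
        prefix, names = header.split(":", 1)
        _, numbers = values.split(":", 1)
        for name, number in zip(names.split(), numbers.split()):
            counters[prefix.strip() + name] = int(number)
    return counters


def listen_overflows(path: str = "/proc/net/netstat") -> int:
    """Return the cumulative ListenOverflows counter (0 if the field is absent)."""
    with open(path) as f:
        return parse_netstat(f.read()).get("TcpExtListenOverflows", 0)
```

The counter is cumulative since boot, so the useful signal is the difference between two readings taken some interval apart.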
How it works
The architecture has four interacting layers: the master process, the per-worker event loop, the connection state machine, and upstream connection pooling.
```mermaid
flowchart TD
    A[Master Process] -->|fork + inherit sockets| B[Worker 1]
    A -->|fork + inherit sockets| C[Worker 2]
    A -->|fork + inherit sockets| D[Worker N]
    B -->|epoll / kqueue| E[Event Loop]
    C -->|epoll / kqueue| F[Event Loop]
    D -->|epoll / kqueue| G[Event Loop]
    E --> H["Connection States<br/>Reading / Writing / Waiting"]
    F --> H
    G --> H
    H --> I["Upstream Keepalive Pool<br/>per worker"]
    H --> J[Client Connection]
    I --> K[Upstream Server]
```

Master process. At runtime the master does almost nothing except manage worker lifecycle. It binds to ports before spawning workers so that workers inherit the listening file descriptors. On reload, triggered by a HUP signal or nginx -s reload, the master spawns new workers with the updated configuration while old workers continue serving existing connections until they drain. A failed reload does not stop NGINX; the previous configuration stays active, which means configuration drift can go undetected if you do not verify that new workers spawned.
Worker event loop. Workers compete to accept new connections from the shared listening sockets. When accept_mutex is on (the default before NGINX 1.11.3; it has been off by default since), workers take turns accepting instead. With reuseport, available since NGINX 1.9.1, the kernel distributes connections across per-worker sockets. Inside the loop, the worker is designed never to block on network I/O. It registers interest in the next I/O event and moves to the next ready connection.
Because the worker is single-threaded, a CPU-heavy TLS handshake, gzip compression of a large response, or a disk write for a temporary file occupies the worker until completion. By default there are no background threads to absorb these tasks; optional thread pools (aio threads) can offload some blocking file I/O, but they must be enabled explicitly and do not cover TLS or compression. This is why CPU profiling and disk I/O latency on the NGINX node matter as much as upstream health.
Connection state machine. Every accepted connection moves through discrete states: reading request headers, reading request body, processing (proxying, serving static files, executing subrequests), writing response headers, writing response body, and keepalive idle. Each state transition is an event. stub_status collapses these into three counters. A connection in Reading is one where NGINX is reading the request header from the client. A connection in Writing is sending a response or waiting for an upstream response. A connection in Waiting is idle on keepalive between requests.
The stub_status module exposes these states as Reading, Writing, and Waiting counts. High Writing with elevated upstream response time means the upstream is slow. High Reading with low throughput means clients are sending slowly, or a slowloris-style attack is consuming slots. High Waiting is usually healthy connection reuse, but it still consumes a connection slot and a file descriptor.
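To make the Reading, Writing, and Waiting breakdown concrete, here is a minimal parsing sketch. It assumes the plain-text layout that stub_status produces; the sample figures are illustrative, and `parse_stub_status` and `state_shares` are hypothetical helper names.

```python
import re

SAMPLE = """Active connections: 291
server accepts handled requests
 16630948 16630948 31070465
Reading: 6 Writing: 179 Waiting: 106
"""


def parse_stub_status(text: str) -> dict[str, int]:
    """Extract the seven stub_status counters from the plain-text page."""
    active = int(re.search(r"Active connections:\s+(\d+)", text).group(1))
    # The totals line is the only line consisting of exactly three integers.
    totals = re.search(r"^[ \t]*(\d+)[ \t]+(\d+)[ \t]+(\d+)[ \t]*$", text, re.M)
    accepts, handled, requests = (int(v) for v in totals.groups())
    states = {
        name.lower(): int(value)
        for name, value in re.findall(r"(Reading|Writing|Waiting):\s+(\d+)", text)
    }
    return {"active": active, "accepts": accepts, "handled": handled,
            "requests": requests, **states}


def state_shares(status: dict[str, int]) -> dict[str, float]:
    """Return each state's share of active connections (0.0 to 1.0)."""
    active = status["active"] or 1    # avoid division by zero on an idle server
    return {s: status[s] / active for s in ("reading", "writing", "waiting")}


if __name__ == "__main__":
    print(state_shares(parse_stub_status(SAMPLE)))
```

Tracking the shares over time, rather than the raw counts, makes it easier to see the mix shift from healthy keepalive reuse toward slow clients or slow upstreams.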
Buffer management. NGINX uses fixed-size per-connection buffers. Request headers are read into client_header_buffer_size, with large_client_header_buffers for overflow; a request line or header field that exceeds even those is rejected with 414 or 400 rather than spilled to disk. Request bodies use client_body_buffer_size. Proxy responses use proxy_buffer_size for headers and proxy_buffers for the body. When the body or proxy response buffers overflow, NGINX spills to temporary files on disk. NGINX logs this only at the warn level, so at the default error_log level of error it is silent, and it produces latency spikes that do not correlate with upstream response time. The gap between request_time and upstream_response_time in the access log often reveals this disk I/O overhead.
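The gap between the two access log timings can be computed per request. The sketch below assumes you have already extracted the `$request_time` and `$upstream_response_time` fields from a log line (the log format itself is site-specific). Both function names are hypothetical. The upstream field may hold several comma- or colon-separated values when NGINX tried more than one server, and `-` when no upstream was contacted.

```python
def upstream_total(upstream_response_time: str) -> float:
    """Sum a $upstream_response_time field that may list several attempts."""
    total = 0.0
    for part in upstream_response_time.replace(":", ",").split(","):
        part = part.strip()
        if part and part != "-":      # "-" means no upstream time was recorded
            total += float(part)
    return total


def non_upstream_time(request_time: str, upstream_response_time: str) -> float:
    """Seconds of a request not explained by upstream response time."""
    gap = float(request_time) - upstream_total(upstream_response_time)
    return max(0.0, gap)              # clamp small negative rounding differences


if __name__ == "__main__":
    # A request that took 1.25 s overall while the upstream answered in 40 ms.
    print(non_upstream_time("1.250", "0.040"))
```

A large result does not prove buffer spill on its own, since slow clients also inflate request_time. It does tell you the time was not spent waiting on the backend.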
Upstream keepalive pools. The keepalive directive in an upstream block maintains persistent connections to backends. The pool is per-worker, not global across all workers. Without keepalive, every proxied request opens a new TCP connection and, if the upstream uses HTTPS, a new TLS handshake. With keepalive, the worker reuses idle connections from its own pool. Total idle capacity equals keepalive size multiplied by worker_processes. Because the pool is not shared, a worker under heavy load cannot borrow an idle connection from a peer worker.
The 2x proxy connection multiplier. When NGINX acts as a reverse proxy, each proxied request consumes at least two connections: one from the client to NGINX and one from NGINX to the upstream. Both connections occupy a slot in worker_connections and both consume a file descriptor. Effective proxy capacity is at most half the configured worker_connections value, minus keepalive overhead on both sides. The default worker_connections of 512 means a maximum of roughly 256 simultaneous proxied requests per worker before the cliff.
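The arithmetic in the two paragraphs above is simple enough to encode. The sketch below gives upper bounds only: it ignores idle keepalive connections on either side, which reduce real capacity further. The function names are hypothetical.

```python
def max_proxied_requests(worker_connections: int, worker_processes: int = 1) -> int:
    """Upper bound on simultaneous proxied requests.

    Each proxied request needs one client-side and one upstream-side
    connection slot, so at most half the slots serve clients.
    """
    return (worker_connections // 2) * worker_processes


def idle_upstream_capacity(keepalive: int, worker_processes: int) -> int:
    """Total idle upstream connections that can be cached across all workers."""
    return keepalive * worker_processes


if __name__ == "__main__":
    print(max_proxied_requests(512))          # default worker_connections, one worker
    print(idle_upstream_capacity(32, 4))      # keepalive 32 with four workers
```

Treat both numbers as ceilings for capacity planning, not as throughput predictions.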
Resource accounting. NGINX competes for several resources that each have hard limits:
- File descriptors: one per client connection, one per upstream connection, one per open log file, one per temp file. The ceiling is worker_rlimit_nofile when it is set; otherwise workers inherit the limit of the environment that started NGINX (ulimit -n, or LimitNOFILE under systemd). Default system limits are often dangerously low.
- Connection slots: pre-allocated per worker via worker_connections. The default is 512.
- Memory: per-connection buffers, SSL buffers (ssl_buffer_size defaults to 16 KB per connection), and shared memory zones for rate limiting, caching, and session state.
- CPU: TLS handshakes dominate at high connection rates. Gzip compression, regex evaluation in location blocks, and module execution also consume cycles.
- Disk I/O: access logging is synchronous by default. Temp file spooling for large request or response bodies adds I/O latency.
- Network: ephemeral port exhaustion can occur for upstream connections when keepalive is not configured or is ineffective.
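Of the resources above, file descriptors are the easiest to measure directly. A minimal sketch, assuming a Linux host with `/proc` mounted and permission to read the worker's entries; the function names are hypothetical and the worker PID has to come from your own process listing.

```python
import os


def parse_max_open_files(limits_text: str) -> int:
    """Return the soft 'Max open files' limit from /proc/<pid>/limits content."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Columns: "Max open files", soft limit, hard limit, units
            return int(line.split()[3])
    raise ValueError("'Max open files' line not found")


def fd_usage(pid: int) -> tuple[int, int]:
    """Return (open descriptors, soft limit) for one process."""
    used = len(os.listdir(f"/proc/{pid}/fd"))
    with open(f"/proc/{pid}/limits") as f:
        limit = parse_max_open_files(f.read())
    return used, limit


def fd_utilization(used: int, limit: int) -> float:
    """Fraction of the descriptor limit currently in use."""
    return used / limit
```

Run this per worker, not against the master: each worker has its own descriptor table and hits the limit independently.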
Where it shows up in production
The architecture behaves differently depending on the deployment pattern.
Standalone HTTP server. This is the simplest case, with the fewest failure modes: connection handling, static file serving, and file descriptor limits. The primary risks are file descriptor exhaustion when serving many small files, and disk I/O latency if the storage subsystem stalls. The connection multiplier does not apply, so worker_connections maps more directly to client capacity.
Reverse proxy / load balancer. Upstream health is the dominant concern. Proxy buffers, timeouts, keepalive pools, and the connection multiplier are critical. The most common failure pattern is backend cascade failure: upstreams slow down, workers hold connections waiting, connection slots fill, and the remaining healthy backends become overloaded.
SSL termination endpoint. CPU is bound by TLS handshakes. Session cache hit rate and TLS version distribution become key metrics. A TLS handshake storm can pin workers at 100% CPU while request throughput collapses.
Caching proxy. Disk I/O and cache hit rate become primary concerns. Cache zone metadata size matters. When the cache is cold after restart, all traffic hits upstream until the in-memory index rebuilds.
Kubernetes ingress controller. Frequent configuration reloads, dynamic upstream endpoints, old worker accumulation, and health check amplification change the failure modes. Each reload spawns new workers and drains old ones; without worker_shutdown_timeout, old workers with long-lived connections can linger indefinitely.
Containerized deployments. File descriptor limits are often constrained by default. Log access patterns differ, and worker_processes auto may detect host CPU count instead of container quota on older kernels.
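Old worker accumulation after frequent reloads can be spotted from the process list, because a draining worker changes its process title. The sketch below assumes the titles `nginx: worker process` and `nginx: worker process is shutting down`, and assumes `ps -eo pid,args` is available on the host; the function names are hypothetical.

```python
import subprocess

DRAINING_TITLE = "nginx: worker process is shutting down"
WORKER_TITLE = "nginx: worker process"


def count_workers(ps_lines: list[str]) -> tuple[int, int]:
    """Return (active workers, draining workers) from process-list lines."""
    active = draining = 0
    for line in ps_lines:
        if DRAINING_TITLE in line:    # check the longer title first
            draining += 1
        elif WORKER_TITLE in line:
            active += 1
    return active, draining


def current_worker_counts() -> tuple[int, int]:
    """Collect the process list with ps and count NGINX workers."""
    result = subprocess.run(["ps", "-eo", "pid,args"],
                            capture_output=True, text=True, check=True)
    return count_workers(result.stdout.splitlines())
```

A draining count that stays above zero long after a reload, or grows with each reload, suggests long-lived connections are pinning old workers.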
Tradeoffs and when this matters
worker_connections versus memory. Each connection slot allocates memory for buffers and connection state. Raising worker_connections increases concurrency but also raises per-worker memory footprint. The default of 512 is often too low for production proxy workloads, but raising it without raising worker_rlimit_nofile and the OS file descriptor limit simply shifts the bottleneck to EMFILE errors.
Per-worker keepalive pools. Because keepalive pools are not shared across workers, load imbalance between workers can leave some pools exhausted while others hold idle connections. With reuseport, the kernel assigns each new connection to a worker socket by hashing the connection's address and port tuple, and a connection stays on that worker for its lifetime. When traffic arrives over a small number of long-lived connections, for example from a Layer 4 load balancer, a few heavy connections can land on the same worker and worsen the imbalance.
Buffer sizing. Large proxy_buffers reduce disk I/O but increase memory per connection. Small buffers cause temp file spooling, which silently degrades latency. The gap between request_time and upstream_response_time in the access log reveals when buffering is the bottleneck.
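Shared memory zones. Zones used for rate limiting, connection limiting, and SSL session caching have fixed sizes. When a limit_req zone fills, NGINX evicts the least recently used states, so the request history for those keys is forgotten and their limits are effectively reset; if it still cannot create a new state, the request is rejected with an error. A full limit_conn zone returns an error to all further requests. Either failure is easy to miss unless you watch the error log and rejection status codes. Zone sizes cannot be changed via reload; they require a full restart.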
Signals to watch in production
| Signal | Why it matters | Warning sign |
|---|---|---|
| Active connections / (worker_connections * worker_processes) | Connection slot utilization approaching the hard cliff | Sustained ratio above 0.75; above 0.9 is critical |
| Accepts minus handled (stub_status) | Dropped connections due to slot or FD exhaustion | Gap growing at any sustained rate |
| Reading / Writing / Waiting breakdown | Reveals whether load is slow clients, slow upstreams, or keepalive reuse | Reading above 30% of active sustained; Writing dominant with low throughput |
| File descriptors per worker vs limit | FD exhaustion blocks accepts, upstream connects, and log opens | Usage above 75% of limit |
| Upstream response time vs request time | Isolates backend latency from client send time and buffer spill latency | upstream_response_time is small but request_time is large |
| Upstream connect time | Connection reuse efficiency and network health to backends | Nonzero values increasing when keepalive is configured |
| Worker CPU per process | Event loop saturation from TLS, compression, or regex | Sustained above 80% of one core |
| Error log rate and severity | Leading indicator of resource exhaustion and upstream failures | Any emerg or alert; sustained error rate above baseline |
| TcpExtListenOverflows | Kernel dropping connections before NGINX sees them | Counter increasing |
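Three of the connection-level rows in the table can be checked mechanically from two consecutive stub_status samples. This is a sketch under stated assumptions: the thresholds are the ones from the table, each sample is a plain dictionary with hypothetical keys `active`, `accepts`, `handled`, and `reading`, and `connection_warnings` is a hypothetical name.

```python
def connection_warnings(prev: dict[str, int], curr: dict[str, int],
                        worker_connections: int, worker_processes: int) -> list[str]:
    """Compare two stub_status samples against the table's warning thresholds."""
    warnings = []

    # Connection slot utilization: warning above 0.75, critical above 0.9.
    capacity = worker_connections * worker_processes
    utilization = curr["active"] / capacity
    if utilization > 0.9:
        warnings.append(f"critical: slot utilization {utilization:.2f}")
    elif utilization > 0.75:
        warnings.append(f"warning: slot utilization {utilization:.2f}")

    # Accepts minus handled: any growth between samples means dropped connections.
    gap_growth = ((curr["accepts"] - curr["handled"])
                  - (prev["accepts"] - prev["handled"]))
    if gap_growth > 0:
        warnings.append(f"warning: {gap_growth} connections dropped since last sample")

    # Reading share: above 30% of active connections.
    if curr["active"] and curr["reading"] / curr["active"] > 0.30:
        warnings.append("warning: Reading above 30% of active connections")

    return warnings
```

A single sample crossing a threshold is noise; the table's warning signs are about sustained conditions, so in practice you would require several consecutive hits before alerting.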
How Netdata helps
- Correlates NGINX stub_status metrics (active connections, accepts, handled, requests, reading, writing, waiting) with per-worker CPU and memory to distinguish event loop saturation from upstream slowness.
- Tracks file descriptor usage per process against configured limits, surfacing EMFILE risk before connections drop.
- Monitors kernel-level TcpExtListenOverflows and socket backlog depth, exposing silent kernel drops invisible to NGINX logs.
- Plots upstream response time and connect time from access log parsing alongside request time, making the proxy buffer overhead and backend latency components visible.
- Alerts on worker process count deviations and reload events, catching old worker accumulation and failed reloads that leave stale configuration active.
Related guides
- nginx 413 Request Entity Too Large: client_max_body_size explained
- nginx 499 status code: why clients close connections before the response
- nginx 500 Internal Server Error: how to diagnose it
- nginx 502 Bad Gateway: causes and how to fix it
- nginx 503 Service Temporarily Unavailable: causes and fixes
- nginx 504 Gateway Time-out: causes and fixes
- NGINX access log performance: buffering, sampling, and the event loop
- NGINX active connections climbing: reading, writing, waiting explained
- nginx: bind() to 0.0.0.0:80 failed (98: Address already in use)
- NGINX backend cascade failure: when slow upstreams take down everything
- NGINX proxy cache hit rate is low: measuring and improving it
- nginx: configuration file test failed - finding the syntax error







