Docker exit code 143: SIGTERM and graceful shutdown failures

You are reviewing container exit codes after a deployment or node drain and see 143. If your monitoring alerts on it, you might think something failed. Exit code 143 is not an error. It is 128 plus signal 15 (SIGTERM), and it means the container’s PID 1 process received SIGTERM and exited voluntarily. This is exactly what docker stop is designed to do.

The operational problem is not the code itself, but what happens around it. If your application never receives the signal, it gets SIGKILL after the timeout and exits 137, potentially dropping connections and corrupting state. If your orchestrator pages you for 143 during a normal rolling restart, your alerts are noisy. If your stop timeout is too short for your database to finish checkpointing, you risk data inconsistency.

This guide explains how SIGTERM propagates through Docker, why it fails to reach your application, and how to fix the root cause so graceful shutdowns actually work.

What this means

Exit code 143 indicates that the container’s primary process terminated in response to SIGTERM. When you run docker stop, the daemon sends SIGTERM to the container’s PID 1, waits for a grace period (default 10 seconds), and then escalates to SIGKILL if the process is still running. A container that exits cleanly during that window reports 143.
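
You can watch the whole sequence locally. A minimal sketch, assuming an image named myapp and the default 10-second grace period:

# Start a container, then stop it: SIGTERM first, SIGKILL after the timeout
docker run -d --name demo myapp
docker stop demo

# 143 means PID 1 exited during the SIGTERM window; 137 means it was killed
docker inspect --format '{{.State.ExitCode}}' demo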

Because Docker only signals PID 1, the identity of that process matters. PID 1 is special: the kernel never applies default signal dispositions to it, so a shell that has not installed a trap simply drops SIGTERM. If PID 1 is a shell launched by a shell-form CMD or ENTRYPOINT, the signal never reaches your application. The application keeps running until the grace period expires and SIGKILL arrives. In that case, the exit code becomes 137, not 143, and the shutdown is not graceful at all.
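
For illustration, the two Dockerfile forms side by side, assuming a Node application in server.js:

# Shell form: PID 1 is /bin/sh -c, which drops SIGTERM unless it traps it
CMD node server.js

# Exec (JSON-array) form: PID 1 is node itself and receives SIGTERM directly
CMD ["node", "server.js"]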

The same risk applies to wrapper scripts that spawn the application as a background child instead of replacing themselves with exec. The signal stops at the shell, and the workload is blindsided by the eventual SIGKILL.
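
A sketch of the failing and working wrapper patterns; entrypoint.sh and myapp are hypothetical names:

#!/bin/sh
# entrypoint.sh
# Broken: myapp runs as a child while the shell stays PID 1,
# so docker stop's SIGTERM dies at the shell
# ./myapp

# Correct: exec replaces the shell, making myapp PID 1
exec ./myapp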

Common causes

| Cause | What it looks like | First thing to check |
| --- | --- | --- |
| Shell-form CMD/ENTRYPOINT | docker stop always ends in 137 after exactly 10s; app logs show no shutdown attempt | docker inspect --format '{{json .Config.Entrypoint}} {{json .Config.Cmd}}' <id> |
| Wrapper script without exec | Shell remains PID 1 and ignores SIGTERM; child process never exits gracefully | docker exec <id> ps -ef to see the process tree |
| Missing application signal handler | App exits 137 after timeout even with JSON-array form; no cleanup in logs | App logs after a manual docker kill --signal=SIGTERM |
| Stop timeout too short | Container consistently exits 137 after N seconds, where N is the configured grace period | Test with docker stop --time=60 or check the orchestrator grace period |
| Zombie/reaping failure | Child processes accumulate; container eventually hangs during shutdown | docker exec <id> ps aux for zombie (Z) states |
| Orchestrator misclassifying 143 | Alerts fire during normal deploys, node drains, or auto-scaling events | Alert routing rules for exit code 143 |
| Kubernetes preStop hook racing | Container gets SIGKILL before the hook's sleep completes; shutdown handler never finishes | Pod spec for preStop and terminationGracePeriodSeconds |

Quick checks

# Confirm exit code and OOM status
docker inspect --format '{{.State.ExitCode}} {{.State.OOMKilled}}' <container_id>

# Inspect entrypoint and command form
docker inspect --format '{{json .Config.Entrypoint}} {{json .Config.Cmd}}' <container_id>

# Check process tree inside the container
docker exec <container_id> ps -ef

# Send SIGTERM manually without a grace timer
# Warning: this stops the container immediately if the handler works
docker kill --signal=SIGTERM <container_id>

# Test whether a longer timeout prevents SIGKILL escalation
# Warning: this stops the container
docker stop --time=60 <container_id>

# Watch die and kill events to confirm timing
docker events --filter event=die --filter event=kill --since 1m

How to diagnose it

  1. Verify the exit code and context. Use docker inspect to confirm the code is 143 and OOMKilled is false. If the code is 137, the process was killed, and the shutdown was not graceful. Check whether the stop was initiated by an operator, CI/CD pipeline, orchestrator, or node drain.
  2. Inspect PID 1. Run docker exec <id> ps -ef or docker top <id> to identify the process at PID 1. If it is /bin/sh, bash, or another shell running your application as a child, the shell is likely blocking signal propagation.
  3. Check the Dockerfile instruction form. Inspect Config.Entrypoint and Config.Cmd. If they appear as single strings rather than JSON arrays (for example, sh -c node server.js instead of ["node", "server.js"]), you are using shell form. Convert to JSON-array form or add exec to the wrapper script so the application replaces the shell.
  4. Test signal delivery manually. Send docker kill --signal=SIGTERM <id> and watch the application logs. If the app begins its shutdown routine within seconds, its handler works and the signal is reaching it. If nothing happens until you send SIGKILL, the signal is blocked or the handler is missing. A timing sketch follows this list.
  5. Evaluate the stop timeout. If the app has a handler but needs more than 10 seconds to finish in-flight requests or flush state, docker stop will escalate to SIGKILL. Increase the timeout with docker stop --time=N, set stop_grace_period in Docker Compose, or adjust terminationGracePeriodSeconds in Kubernetes. Ensure the application’s handler has enough runway.
  6. Check for child process and zombie leaks. If the application forks workers and PID 1 does not call waitpid(), zombies accumulate. During shutdown, unreaped children can hold open resources and delay exit. Use docker run --init to inject tini as PID 1, or add an init system like tini or dumb-init to the image.
  7. Review orchestrator-side hooks and signals. In Kubernetes, the termination grace period timer starts as soon as the pod begins terminating; if a preStop hook is defined, the kubelet runs it first and sends SIGTERM only after the hook completes. A hook that sleeps to allow load-balancer deregistration therefore burns grace period budget before the application even sees the signal. If the total time exceeds terminationGracePeriodSeconds, the container is SIGKILL’d while the hook or shutdown handler is still running. Remove unnecessary sleep from preStop or increase the grace period to cover both the hook and the application shutdown.
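
To put numbers on steps 4 and 5, one hedged approach, assuming a running container named app:

# Send SIGTERM with no escalation timer
docker kill --signal=SIGTERM app

# Blocks until the container exits, then prints its exit code.
# A prompt 143 means the handler ran; if this hangs, the signal never
# reached the application (clean up with a plain docker kill app)
docker wait app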

Metrics and signals to monitor

| Signal | Why it matters | Warning sign |
| --- | --- | --- |
| Container exit code | Distinguishes graceful shutdown (143) from forced kill (137) or crashes (1) | Repeated 137 after docker stop or deploy events |
| Container restart count | Identifies crash loops caused by shutdown failures | Restarts increasing after every rolling update |
| Docker kill events | Reveals SIGKILL escalation due to timeout | Kill events following stop events within the grace period |
| Application error rate / dropped connections | Shows whether in-flight work is aborted mid-shutdown | Spike in 5xx errors or connection resets during deploys |
| Container process count / zombie count | Detects PID 1 reaping failures | Zombie (Z) processes increasing over time |
| Orchestrator termination duration | Measures actual shutdown time against the limit | Duration consistently near or exceeding the grace period |

Fixes

If shell form is blocking the signal

Convert CMD node server.js to CMD ["node", "server.js"] in the Dockerfile. If you must use a shell wrapper, use exec ./myapp so the application replaces the shell process and becomes PID 1.

If the application lacks a handler

Implement a SIGTERM handler in your application code. The handler should stop accepting new work, finish in-flight requests, flush buffers, and then exit. Test it by running the container locally and sending docker kill --signal=SIGTERM.
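
A minimal handler sketch in Node.js, matching the server.js example above; the same shape applies in any runtime:

// server.js: plain HTTP server with a graceful SIGTERM handler
const http = require('http');

const server = http.createServer((req, res) => res.end('ok'));
server.listen(8080);

process.on('SIGTERM', () => {
  // Stop accepting new connections; in-flight requests are allowed to finish
  server.close(() => {
    // Flush buffers and close pools here, then exit.
    // exit(0) reports success; some teams use process.exit(143)
    // to preserve the 128+15 convention described above.
    process.exit(0);
  });
});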

If the stop timeout is too short

Increase the timeout. For Docker CLI, use docker stop --time=N. For Docker Compose, set stop_grace_period. For Kubernetes, set terminationGracePeriodSeconds. Also align the application handler’s internal timeout to be shorter than the external limit.
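
Hedged examples of those knobs; the service name web and the 60-second budget are placeholders:

# docker-compose.yml
services:
  web:
    image: myapp
    stop_grace_period: 60s

# Kubernetes pod spec (fragment)
spec:
  terminationGracePeriodSeconds: 60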

If PID 1 does not reap zombies

Use docker run --init to inject Docker’s embedded tini-based init. This handles SIGTERM forwarding and zombie reaping without modifying the image. Alternatively, add tini or dumb-init to the image ENTRYPOINT.
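
Both options in sketch form; the tini path assumes a package install such as apk add tini:

# Runtime flag: Docker's bundled docker-init (tini-based) becomes PID 1
docker run --init -d myapp

# Or bake it into the image
ENTRYPOINT ["/sbin/tini", "--"]
CMD ["node", "server.js"]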

If the orchestrator treats 143 as a failure

Update your alerting rules and runbooks. Exit code 143 is a successful graceful shutdown in response to SIGTERM. Whitelist it for stop, drain, and scale-down events. Alert on 137 (SIGKILL) or unexpected 143 outside of maintenance windows.

If preStop hooks consume the grace period

Remove artificial sleep from Kubernetes preStop hooks where possible. If you need a delay for load-balancer draining, increase terminationGracePeriodSeconds so the sum of the hook duration and the application shutdown fits comfortably inside the limit.
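
A pod-spec sketch where the budget covers both phases; the durations are illustrative and the exec hook assumes a sleep binary in the image:

spec:
  terminationGracePeriodSeconds: 45   # must cover hook + application shutdown
  containers:
    - name: web
      image: myapp
      lifecycle:
        preStop:
          exec:
            command: ["sleep", "15"]   # load-balancer deregistration window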

Prevention

  • Use JSON-array form for all ENTRYPOINT and CMD instructions unless you specifically need shell expansion.
  • Always use exec in wrapper scripts to replace the shell.
  • Set STOPSIGNAL explicitly in the Dockerfile if your application uses a custom shutdown signal, so operators do not have to guess.
  • Configure stop timeouts in deployment manifests, not just at runtime. Document how long your application needs to shut down cleanly.
  • Test graceful shutdown in CI: start the container, send SIGTERM, and assert that it exits 143 within the expected window with no dropped connections (see the sketch after this list).
  • Monitor for exit code 137 following stop events. That pattern means your graceful shutdown is broken even if the application eventually restarts.
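
A minimal CI sketch of that last check, assuming the image under test is tagged myapp:ci:

# Start, stop, and assert a graceful exit
docker run -d --name shutdown-test myapp:ci
sleep 2   # give the app time to install its signal handler
docker stop --time=15 shutdown-test
code=$(docker inspect --format '{{.State.ExitCode}}' shutdown-test)
docker rm shutdown-test
test "$code" -eq 143 || { echo "graceful shutdown broken (exit $code)"; exit 1; }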

How Netdata helps

  • Correlate container exit code 143 spikes with deployment events or node drains to distinguish normal shutdowns from anomalies.
  • Monitor container restart counts alongside exit codes to catch loops caused by signal propagation failures.
  • Track application-level error rates and container lifecycle events on the same timeline to identify dropped connections during rolling updates.
  • Visualize CPU and memory saturation to confirm the application had resources to complete its shutdown routine.