Docker exit code 1: application errors and how to find them

A container exiting immediately with code 1 is a common production incident. Unlike exit code 137 (OOM killer) or code 125 (Docker daemon error), code 1 means the process inside the container exited with status 1, whether by calling exit(1) explicitly or by letting its runtime terminate on an unhandled error. Docker is only reporting what PID 1 did.

The challenge is that code 1 is a catch-all. It can mask an unhandled JavaScript exception, a Python import error, a missing configuration file, a shell script failing under set -e, or a Go binary that cannot reach its database. Time spent checking Docker daemon health or host memory is wasted when the real failure is an application-level error written to stdout or stderr.

This guide shows how to confirm code 1 is an application error, trace it to the right layer, and stop the container from crash-looping while you fix the root cause.

What this means

Docker reserves only three exit codes for Docker-layer errors: 125 (daemon error), 126 (command not executable), and 127 (command not found). Every other code, including 1, is the raw exit status returned by the container’s PID 1.

Exit code 1 is the generic “application error” status. Many language runtimes map it to an unhandled exception: Node.js fatal exceptions, Python tracebacks, Go programs calling os.Exit(1), or shell scripts exiting because a command returned non-zero. It is distinct from code 2, which usually signals shell misuse, and from code 255, which indicates an out-of-range exit status.

Because code 1 is broad, the diagnostic work is about ruling out its siblings first. Code 137 (128+9) means SIGKILL, often from the OOM killer. Code 139 (128+11) means SIGSEGV. Code 143 (128+15) means SIGTERM. Misclassifying a code 1 problem as one of these sends you after the wrong fix.
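These conventions are easy to reproduce in any POSIX shell, which is useful when you are unsure which class a status belongs to:

```shell
# Exit code 1 is the process's own error status.
sh -c 'exit 1' || echo "exit code: $?"          # exit code: 1

# Exit codes above 128 encode "killed by signal N" as 128 + N.
sh -c 'kill -TERM $$' || echo "exit code: $?"   # exit code: 143 (128 + 15, SIGTERM)
```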

Common causes

| Cause | What it looks like | First thing to check |
| --- | --- | --- |
| Unhandled application exception | Stack trace or error in logs; container exits immediately | docker logs <id> for the traceback |
| Shell entrypoint with set -e | Silent exit after an intermediate command fails; application may never start | The entrypoint script for set -e and unguarded commands |
| Missing configuration or secret | Application fails during initialization with a config error | docker inspect for missing Env, mounts, or secrets |
| Dependency unavailable (DB, API, DNS) | Connection refused or timeout logged, then exit 1 | Network connectivity and DNS resolution from inside the container |
| Missing dynamic library (Alpine) | Binary starts but exits 1, distinct from code 127 | Run ldd or file on the binary inside the container |
| Restart loop | Container restarts repeatedly, logs are flooded, original error is buried | docker inspect RestartCount; pause the restart policy |

Quick checks

Run these checks first to confirm the code and locate the failure. All of them are read-only except pausing the restart policy, which is flagged with a warning below.

# Verify exact exit code and rule out OOM
docker inspect --format '{{.State.ExitCode}} OOMKilled={{.State.OOMKilled}}' <container_id>

If the exit code is 1 and OOMKilled is false, you are looking at an application error. If OOMKilled is true and the code is 137, see the OOM troubleshooting guide instead.

# Read logs from the crashed container
docker logs --tail 500 <container_id>

If the container is restarting too fast to read logs, pause the restart policy. Warning: this stops automatic recovery and changes deployment behavior until reverted.

# Pause restarts to stop the crash loop
docker update --restart=no <container_id>

Then read the logs again.

# Check how many times it has restarted
docker inspect --format '{{.RestartCount}}' <container_id>

A nonzero and increasing RestartCount means the container is in a restart loop. The combination of restart count and exit code 1 is the signature of an unstable application.

# Inspect the entrypoint and command
docker inspect --format 'Entrypoint={{.Config.Entrypoint}} Cmd={{.Config.Cmd}}' <container_id>

Look for shell scripts that chain commands. If the entrypoint is a shell script, any unguarded command that returns non-zero can trigger an immediate exit 1 when set -e is active.
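A minimal sketch of this failure mode, using a hypothetical config path: under set -e the script dies at the failed check, and the application line never runs.

```shell
cat > /tmp/entrypoint-fail.sh <<'EOF'
#!/bin/sh
set -e
test -f /etc/myapp/config.yml   # hypothetical required file; missing -> status 1, script exits here
echo "starting app"             # never reached
EOF

sh /tmp/entrypoint-fail.sh || echo "entrypoint exited with $?"   # entrypoint exited with 1
```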

# Check how long the container ran before exiting
docker inspect --format 'Started={{.State.StartedAt}} Finished={{.State.FinishedAt}}' <container_id>

A start and finish time within the same second indicates an immediate initialization failure. A longer interval suggests the application failed after a timeout or a background check.

# Test commands interactively inside the image
docker run --rm -it --entrypoint sh <image>

From inside, run the application manually or step through the entrypoint to see exactly which command returns 1. This requires a shell in the image; distroless images will need a debug tag or an alternative approach.

How to diagnose it

Follow this flow to narrow from “code 1” to the specific failure.

  1. Confirm the exit code and OOM status. Use docker inspect to read .State.ExitCode and .State.OOMKilled. If the code is 137 and OOMKilled is true, the problem is memory pressure.
  2. Collect the container logs. Run docker logs <id>. Look for stack traces, configuration errors, or dependency connection failures. If the container restarted and overwrote the logs, pause the restart policy first.
  3. Map the failure to the runtime. Node.js containers usually dump a fatal exception. Python containers show a traceback. Go binaries may log a panic or exit silently. Java containers may fail before the JVM fully initializes.
  4. Inspect the entrypoint chain. If the image uses a shell entrypoint, check whether set -e is causing a premature exit before the main application runs. Test step by step by running the image with --entrypoint sh.
  5. Check for missing dynamic libraries. On Alpine-based images, an incompatible binary or missing library can cause exit 1 even when the binary path is correct. Run ldd <binary> inside the container to verify linking.
  6. Verify configuration and secrets. Use docker inspect to check environment variables and volume mounts. A missing required env var or unmounted config file is a common cause of initialization failures.
  7. Test network dependencies. From inside the container, run nslookup, dig, or nc against required databases and APIs. DNS failures and connection refusals often manifest as exit 1 if the application does not handle them gracefully.
  8. Correlate restart count with log timestamps. Check whether the failure occurs earlier or later in the initialization sequence across restarts. A shifting failure point indicates which subsystem is responsible.

Metrics and signals to monitor

| Signal | Why it matters | Warning sign |
| --- | --- | --- |
| Container exit code | Distinguishes application errors from OOM or signals | Exit code 1 on production containers |
| Container OOM killed status | Rules out memory pressure that is often confused with code 1 | OOMKilled=true paired with exit 137 |
| Container restart count | Identifies crash loops before they consume disk and CPU | Restart count increasing faster than once per hour |
| Container health check status | Catches functional failures that do not kill the process | Status unhealthy while the container remains running |
| Application stdout/stderr logs | Contains the actual error message or stack trace | Error output immediately preceding the exit |
| Container start latency | Rising latency can mean the app is failing later in startup | Start time growing across restart attempts |

Fixes

Apply the fix that matches the root cause.

If the cause is an unhandled application exception

Fix the code path or configuration causing the crash. If you need more data, run a debug build or add logging, then redeploy. Do not mask the problem by increasing the restart limit or delay.

If the cause is a shell entrypoint failing early

Override the entrypoint or edit the script to isolate the failing step. Remove set -e temporarily or add || true to non-critical commands. Ensure the final application command uses exec so signals propagate correctly to PID 1.
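A hardened entrypoint sketch (the warm-up step is a hypothetical non-critical command): failures in optional steps are guarded, and the final command is exec'd so the application itself becomes PID 1.

```shell
cat > /tmp/entrypoint-fixed.sh <<'EOF'
#!/bin/sh
set -e
warm_cache() { return 1; }                           # hypothetical optional step that fails
warm_cache || echo "warn: cache warm-up failed" >&2  # guarded: the failure is logged, not fatal
exec echo "app running as PID 1"                     # exec replaces the shell; signals reach the app
EOF

sh /tmp/entrypoint-fixed.sh   # prints "app running as PID 1" and exits 0
```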

If the cause is a missing dependency or configuration

Mount the missing file, set the required environment variable, or fix the secret injection. Verify the host file exists at the mounted path, then test by running the container interactively before pushing to production.
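A fail-fast guard makes the missing value obvious in the logs instead of producing a confusing crash deeper in startup; DB_URL is a hypothetical required variable here.

```shell
# "${VAR:?message}" aborts a non-interactive shell with the message on stderr
# when VAR is unset or empty, so the first log line names the missing variable.
env -u DB_URL sh -c ': "${DB_URL:?DB_URL must be set}"; echo "config ok"' \
  || echo "refused to start (exit $?)"

DB_URL="postgres://db:5432/app" sh -c ': "${DB_URL:?DB_URL must be set}"; echo "config ok"'
# prints: config ok
```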

If the cause is a missing dynamic library

Rebuild the image with the correct base image or install the missing library. Verify with file and ldd inside the container before declaring the image ready.
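For example, checking a binary's linking with ldd (output varies by libc; the shell binary is used here only as a stand-in for your application binary):

```shell
# ldd prints each shared library the binary requires and where it resolves;
# a missing library appears as "=> not found", and a statically linked
# binary reports "not a dynamic executable".
ldd /bin/sh
```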

If the cause is a restart loop masking the error

Pause the restart policy with docker update --restart=no <id>. This stops log flooding and lets you inspect the container in its exited state. Fix the root cause, then restore the original policy.

Prevention

  • Configure log rotation with max-size and max-file limits. Crash loops that exit with code 1 can fill disk quickly.
  • Add meaningful health checks to every production container. A process that stays up but cannot serve requests should be caught by health checks, not just exit codes.
  • Monitor exit codes and restart counts proactively. Surface alerts before disk-full or CPU alerts reveal a crash loop.
  • Test entrypoint scripts carefully. If you use set -e, add explicit error logging so the failing command is obvious.
  • For Alpine or statically linked images, verify binary compatibility in the build pipeline with ldd or file to avoid runtime linking failures.
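The first bullet can be applied daemon-wide; a sketch of /etc/docker/daemon.json using the json-file driver's rotation options:

```json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
```

The same limits can be set per container with --log-opt max-size=10m --log-opt max-file=3. Changes to daemon.json require a daemon restart and apply only to containers created afterward.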

How Netdata helps

Netdata tracks container-level signals to help distinguish code 1 from resource issues.

  • Tracks container exit codes and restart counts to flag crash loops.
  • Monitors OOM kill events and memory usage to rule out memory pressure when code 1 appears.
  • Monitors container health check status to catch failures that do not kill the process.
  • Alerts on rising container start latency, which can indicate initialization sequences that fail deeper on each attempt.
  • Tracks disk utilization from container logs alongside restart events to flag crash loops that risk filling disk.