Kubernetes pod exits immediately: how to diagnose it

When a pod shows Completed or Error with zero restarts, the container exited on its first run. The diagnostic evidence lives in termination metadata, not in a growing restart count. This is distinct from CrashLoopBackOff, where the kubelet has already applied exponential backoff after multiple restarts.

This guide covers how to distinguish a clean exit, an OOM kill, an application crash, and a configuration error using only the kubelet’s reported state and the previous container logs, plus which node-level and control-plane signals to check when the container produced no logs.

What this means

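When every container in a pod has terminated and the kubelet will not restart any of them, the pod phase becomes Succeeded if all exited with code 0, or Failed if any exited non-zero. Pods managed by a Deployment always run with restartPolicy: Always, so even a clean exit triggers a restart and the phase stays Running. Under Never, the pod stays terminal after any exit; under OnFailure, it stays terminal only after a clean exit, and a non-zero exit is restarted.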

Immediately after the first termination, the RESTARTS counter is still 0 and kubectl describe pod reports the exit code and reason under State: Terminated. If the kubelet restarts the container, that record moves from state.terminated to lastState.terminated (shown as Last State: Terminated) while state becomes Waiting or Running. lastState holds only the most recent termination, so capture the first exit before a later one overwrites it.

Common causes

Cause	What it looks like	First thing to check
One-shot command with `restartPolicy: Always`	Pod exits cleanly (code `0`) but immediately restarts	`kubectl describe pod` for `Last State: Terminated`, `Reason: Completed`, `Exit Code: 0`
OOMKilled	Exit code `137`, often with no application logs	`kubectl describe pod` for `Reason: OOMKilled`; node `MemoryPressure` condition
Application startup crash	Exit code `1`, stack trace or config error in logs	`kubectl logs <pod> --previous`
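Missing secret, configmap, or env	Exit code `1` with `FileNotFoundError` or similar in logs, or a container stuck in `CreateContainerConfigError` when a referenced Secret or ConfigMap does not exist	Pod events and `--previous` logs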
Init container failure	Main containers never start; init exits with error	`kubectl describe pod` for init container state
Sub-second exit before log flush	Empty `--previous` logs, exit code present	Structured container status via `kubectl get pod -o jsonpath`
Node resource pressure eviction	Pod terminated by kubelet, status `Evicted`	Node conditions and `kubelet_evictions_total`

Quick checks

Run these checks in order. They are read-only and safe.

# Check pod phase and restart count
kubectl get pod <pod-name> -o jsonpath='{.status.phase}{"\t"}{.status.containerStatuses[0].restartCount}{"\n"}'

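# Check termination reason and exit code ("State:" before a restart, "Last State:" after)
kubectl describe pod <pod-name> | grep -A 7 "State:"

# Retrieve logs from the terminated container instance (omit --previous if the container has not restarted)
kubectl logs <pod-name> --previous

# Extract the structured termination record; it sits under state before a restart and lastState after
kubectl get pod <pod-name> -o jsonpath='{.status.containerStatuses[0].state.terminated}{"\n"}{.status.containerStatuses[0].lastState.terminated}{"\n"}' | jq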

# Check node-level pressure conditions
kubectl describe node <node-name> | grep -E "MemoryPressure|DiskPressure|PIDPressure"

# Check for kernel OOM events (run on the node itself, for example over SSH)
dmesg | grep -i "out of memory"

# Inspect restartPolicy and container command
kubectl get pod <pod-name> -o jsonpath='{.spec.restartPolicy}{"\t"}{.spec.containers[0].command}{"\n"}'

# Check if pod was evicted
kubectl get pod <pod-name> -o jsonpath='{.status.reason}'

For a healthy long-running pod, expect phase Running, restartCount: 0, and no Terminated state. Bad output depends on the cause: Exit Code: 0 with Completed suggests a one-shot command running under restartPolicy: Always; Exit Code: 137 with OOMKilled means the container was killed for exceeding its memory limit or because the node ran out of memory; empty logs with a non-zero exit code suggest the process crashed before writing or flushing any output.

How to diagnose it

1. Confirm a first-run exit. Run kubectl get pod <name>. If RESTARTS is 0 and STATUS is Completed or Error (phase Succeeded or Failed), the container exited on its first run. If the count is 1 or higher, the kubelet has already restarted it; treat that as a CrashLoopBackOff pattern instead.
2. Capture termination metadata immediately. Run kubectl describe pod <name> and look under State: Terminated, or under Last State: Terminated if the container has already been restarted. Record the Reason (Completed, Error, OOMKilled), Exit Code, and Message. To query the record directly, run kubectl get pod <name> -o jsonpath='{.status.containerStatuses[*].lastState.terminated}' (use state.terminated before the first restart).
3. Interpret the exit code.
   - 0: The process exited cleanly. If the pod is restarting, check whether restartPolicy is Always when it should be Never or OnFailure.
   - 1: Generic application error. Look for stack traces or configuration failures in kubectl logs --previous.
   - 137 (128 + 9): The process received SIGKILL. Reason: OOMKilled in the container status confirms the OOM killer; without it, the kubelet or runtime may have sent SIGKILL, for example after a termination grace period expired. Cross-check with node memory pressure and container limits.
   - 143 (128 + 15): The process received SIGTERM. This is normal during graceful shutdown but unexpected on startup.
4. Retrieve logs from the terminated instance. Run kubectl logs <pod-name> --previous, or kubectl logs <pod-name> if the container has not been restarted. Empty output means the container exited before writing to stdout or stderr, or the application did not flush its buffers. In that case, rely on termination metadata and node-level signals.
5. Check for node-level pressure. Run kubectl describe node <node-name> and look at conditions. MemoryPressure=True means the node has crossed a memory eviction threshold and the kubelet may be evicting pods. DiskPressure=True signals low disk, which can trigger evictions and interfere with image pulls or log writes. Check kubelet_evictions_total metrics for the specific eviction signal.
6. Inspect init container state. If the pod is stuck in Init:Error, the init container exited immediately. Run kubectl logs <pod-name> -c <init-container-name> to see why. Main containers will not start until all init containers complete successfully.
7. Correlate with cluster events. Run kubectl get events --field-selector involvedObject.name=<pod-name>. Look for FailedScheduling, FailedMount, FailedCreatePodSandBox, or Killing events that preceded the exit. A Killing event from the kubelet means the kubelet stopped the container (for example after an eviction, a failed liveness probe, or a pod deletion), not that the application crashed on its own.
8. Compare the API server state to the original spec. Run kubectl get pod <pod-name> -o yaml and compare it against the manifest that created it. Silent mutations, defaulted fields, or injected sidecars can change the effective container command or environment.
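
The exit-code arithmetic in the steps above (128 plus the signal number) can be sketched as a small shell helper. This is a minimal illustration, not part of any Kubernetes tooling: the function name decode_exit_code is hypothetical, and the signal numbers assume Linux.

```shell
# decode_exit_code CODE
# Prints a one-line interpretation of a container exit code.
# Codes above 128 conventionally mean "terminated by signal (CODE - 128)".
decode_exit_code() {
  code=$1
  # Reject empty or non-numeric input.
  case $code in
    ''|*[!0-9]*)
      echo "invalid exit code: $code"
      return 2
      ;;
  esac
  if [ "$code" -eq 0 ]; then
    echo "0: clean exit"
  elif [ "$code" -gt 128 ] && [ "$code" -le 192 ]; then
    sig=$((code - 128))
    # Linux signal numbers; only the common ones are named here.
    case $sig in
      6)  name=SIGABRT ;;
      9)  name=SIGKILL ;;
      11) name=SIGSEGV ;;
      15) name=SIGTERM ;;
      *)  name="signal $sig" ;;
    esac
    echo "$code: killed by $name (128 + $sig)"
  else
    echo "$code: application-defined error"
  fi
}
```

For example, decode_exit_code 137 prints "137: killed by SIGKILL (128 + 9)". The helper only decodes the number; whether a SIGKILL came from the OOM killer still has to be confirmed from the Reason field in the container status.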

flowchart TD
  A["Pod exits immediately<br/>RESTARTS: 0"] --> B{"Exit code?"}
  B -->|0| C["One-shot job with<br/>restartPolicy: Always?"]
  B -->|1| D["Application error<br/>Check logs --previous"]
  B -->|137| E["OOMKilled or SIGKILL<br/>Check memory limits<br/>and node pressure"]
  B -->|143| F["SIGTERM on startup<br/>Check preStop hooks<br/>and grace period"]
  C -->|Yes| G["Change restartPolicy<br/>to Never or OnFailure"]
  C -->|No| H["Check for expected<br/>clean completion"]
  D --> I["Fix code, config,<br/>or missing secrets"]
  E --> J["Raise limits or<br/>reduce memory usage"]
  F --> K["Adjust shutdown<br/>behavior"]

Metrics and signals to monitor

Signal	Why it matters	Warning sign
Pod phase distribution	Reveals pods terminating outside normal churn	Sustained increase in `Failed` or `Succeeded` pods that should be `Running`
Container restart count	Lagging indicator of prior exits	Restart count increasing for a stable workload
`lastState.terminated.reason`	Distinguishes OOM, error, and clean completion	`OOMKilled` or `Error` in terminal state
Node `MemoryPressure` condition	Triggers kernel OOM or kubelet eviction	`MemoryPressure=True` on production nodes
Container memory working set vs limit	OOM occurs when usage exceeds the cgroup limit	Working set within 10% of the memory limit
`kubelet_evictions_total`	Kubelet evicts pods to reclaim resources	Any eviction event for non-best-effort workloads
`kubelet_pleg_relist_duration_seconds`	Slow PLEG delays state reporting to the API server	p99 relist duration above 5 seconds
API server mutating request latency	Slow admission or etcd delays status updates	p99 mutating latency above 1 second sustained
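
The "working set within 10% of the memory limit" warning sign from the table is simple integer arithmetic. The sketch below assumes both values are already available in bytes (for instance from a metrics pipeline); the function names memory_headroom_pct and near_memory_limit are hypothetical.

```shell
# memory_headroom_pct WORKING_SET_BYTES LIMIT_BYTES
# Prints the integer percentage of the memory limit that is still unused.
memory_headroom_pct() {
  used=$1
  limit=$2
  if [ "$limit" -le 0 ]; then
    echo "limit must be positive" >&2
    return 2
  fi
  echo $(( (limit - used) * 100 / limit ))
}

# near_memory_limit WORKING_SET_BYTES LIMIT_BYTES [THRESHOLD_PCT]
# Returns 0 (true) when headroom is at or below the threshold (default 10).
near_memory_limit() {
  threshold=${3:-10}
  headroom=$(memory_headroom_pct "$1" "$2") || return 2
  [ "$headroom" -le "$threshold" ]
}
```

With a 512 MiB limit (536870912 bytes) and a 480 MiB working set (503316480 bytes), memory_headroom_pct prints 6 and near_memory_limit succeeds, which matches the warning condition. Integer division rounds toward zero, so treat the result as approximate near the threshold.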

Fixes

If the cause is a one-shot job with `restartPolicy: Always`

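Pods created by a Deployment can only use restartPolicy: Always, so one-shot work does not belong in a Deployment. Run it as a bare pod with restartPolicy: Never, or use a Kubernetes Job, whose pod template must set restartPolicy to Never or OnFailure. A container that exits cleanly with code 0 will still be restarted under Always.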
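
One way to move a one-shot command into a Job is kubectl create job. The sketch below wraps it in a hypothetical helper, render_one_shot_job, and uses a client-side dry run so nothing is created in the cluster. It assumes kubectl is installed and configured; the name, image, and command are whatever the caller supplies.

```shell
# render_one_shot_job NAME IMAGE COMMAND [ARGS...]
# Prints a Job manifest for review without creating anything
# (client-side dry run). Everything after IMAGE becomes the container command.
render_one_shot_job() {
  name=$1
  image=$2
  shift 2
  kubectl create job "$name" --image="$image" --dry-run=client -o yaml -- "$@"
}
```

Before applying the output, inspect spec.template.spec.restartPolicy in the printed manifest and confirm it is Never or OnFailure, as a Job requires.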

If the cause is OOMKilled

Increase the container memory limit, or reduce the application’s memory footprint. For Java applications, ensure the max heap size leaves headroom for native memory and the container overhead. If the node itself is under MemoryPressure, scale the node pool or evict heavy best-effort pods.

If the cause is an application startup error

Read kubectl logs --previous to find the stack trace, missing file, or configuration failure. Verify that ConfigMaps, Secrets, and environment variables referenced in the pod spec exist and are mounted correctly. Fix the application code or container image.

If the cause is an init container failure

Run kubectl logs <pod> -c <init-container> to capture the init container’s output. Fix the initialization script, dependency, or command. Init container restarts are counted separately from the main containers, and a failing init container blocks the main containers indefinitely.

If the cause is node resource pressure

Warning: Disruptive. Cordon prevents new pods from scheduling to the node.

Cordon the node, then free disk space, remove unused images, or add nodes to the pool. Set resource requests and limits on all workloads so the scheduler and kubelet can make informed eviction decisions.

If the cause is missing log output

If the container exits before flushing logs, add a log flush call at application startup as a temporary debugging measure. Alternatively, write a termination message to the termination message path so kubectl describe pod surfaces it without relying on log buffers.
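
As a rough sketch of the termination-message approach, an entrypoint wrapper can copy the tail of stderr into the termination message file when the process fails. The function name run_with_termination_message is hypothetical; the message path is passed in by the caller, and /dev/termination-log is the Kubernetes default for terminationMessagePath.

```shell
# run_with_termination_message MESSAGE_PATH COMMAND [ARGS...]
# Runs COMMAND; if it fails, writes the exit code and the last lines of its
# stderr to MESSAGE_PATH so the kubelet can report them in the container status.
run_with_termination_message() {
  msg_path=$1
  shift
  err_file=$(mktemp) || return 1
  "$@" 2>"$err_file"
  rc=$?
  # Forward the captured stderr to the normal log stream as well.
  cat "$err_file" >&2
  if [ "$rc" -ne 0 ]; then
    {
      echo "exit code $rc"
      tail -n 5 "$err_file"
    } >"$msg_path"
  fi
  rm -f "$err_file"
  return "$rc"
}
```

An entrypoint script could call it as run_with_termination_message /dev/termination-log followed by the real command. Two caveats apply to this sketch: stderr is only forwarded after the process exits, and the wrapper does not forward signals to the child, so it suits short debugging sessions rather than production entrypoints.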

Prevention

Match restartPolicy to workload type. Use Always for long-running services, OnFailure for batch jobs, and Never for one-shot tasks.
Set memory requests and limits. This prevents the kernel OOM killer from targeting containers unpredictably and gives the scheduler the data it needs.
Use startup probes for slow-starting containers. Do not use liveness probes to catch startup failures; a failing liveness probe on a container that is still initializing causes unnecessary restarts.
Monitor pod phase distribution and container restart counts. Baseline these metrics per workload so you can detect a sudden shift to Failed or Succeeded.
Write termination messages. Configure terminationMessagePath and terminationMessagePolicy so application fatal errors are surfaced in kubectl describe pod even when logs are empty.
Include kubectl logs --previous in runbooks. Operators should run this immediately after detecting an unexpected exit, before the kubelet restarts the container and the evidence shifts.
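
A runbook step for capturing evidence might look like the following sketch. It only uses read-only kubectl commands already shown in this guide; the function name capture_exit_evidence and the output file names are assumptions for illustration.

```shell
# capture_exit_evidence POD NAMESPACE OUT_DIR
# Saves describe output, the full pod object, logs, and events for one pod
# so the evidence survives later restarts.
capture_exit_evidence() {
  pod=$1
  ns=$2
  out=$3
  mkdir -p "$out" || return 1
  kubectl describe pod "$pod" -n "$ns" >"$out/describe.txt" 2>&1
  kubectl get pod "$pod" -n "$ns" -o json >"$out/pod.json" 2>&1
  # --previous fails when no earlier instance exists; fall back to current logs.
  kubectl logs "$pod" -n "$ns" --previous >"$out/logs.txt" 2>&1 ||
    kubectl logs "$pod" -n "$ns" >"$out/logs.txt" 2>&1
  kubectl get events -n "$ns" \
    --field-selector "involvedObject.name=$pod" >"$out/events.txt" 2>&1
  echo "evidence written to $out"
}
```

For multi-container pods, the logs call would need a -c flag per container; the sketch leaves that out to stay short.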

How Netdata helps

Netdata collects kubelet metrics such as kubelet_running_pods, kubelet_container_start_duration_seconds, and kubelet_evictions_total to correlate pod exits with node-level events.
Per-container cgroup memory charts show working set growth approaching the limit before the OOM killer triggers.
Node condition alerts for MemoryPressure and DiskPressure trigger before the kubelet begins evicting pods.
API server latency monitoring detects slow admission webhooks or etcd disk latency that delays pod status updates and masks the true timing of a container exit.

Related guides

See Kubernetes eviction cascade: when one node failure takes down the cluster for node-pressure cascades.
See Kubernetes kubelet memory leak: detection and OOM cycle for kubelet-level OOM patterns.
See Kubernetes kubelet not responding: PLEG, runtime, and certificate issues when node-level health is the root cause.
See Kubernetes DNS resolution failures inside pods if the exit is caused by a failing dependency.
See Kubernetes API server slow or unresponsive: causes and fixes when control plane latency delays pod lifecycle reporting.

