Kubernetes node DiskPressure: detection, eviction, and recovery

A node reporting DiskPressure is actively shedding workloads. The kubelet has detected that nodefs or imagefs has crossed an eviction threshold. It is garbage collecting images, terminating pods, and applying the node.kubernetes.io/disk-pressure taint to block new scheduling. Existing pods may continue running, but any pod requiring disk for logs, emptyDir volumes, or image pulls is at risk.

Disk pressure builds predictably, unlike memory pressure. This guide covers how the kubelet evaluates disk pressure, how to distinguish nodefs from imagefs exhaustion, how to find the specific consumer, and how to recover without causing a cascading eviction loop.

What this means

The kubelet monitors two filesystems for disk pressure: nodefs (the filesystem holding /var/lib/kubelet, container logs, and usually /var/log) and imagefs (the filesystem backing the container runtime image store, typically /var/lib/containerd or /var/lib/docker). On Linux, the default hard eviction thresholds are nodefs.available < 10%, nodefs.inodesFree < 5%, and imagefs.available < 15%.

When a hard threshold is crossed, the kubelet sets DiskPressure to True, applies the disk-pressure taint, and begins evicting pods. Hard eviction ignores graceful termination periods and PodDisruptionBudgets. Pods are ranked for eviction first by whether their usage of the starved resource exceeds their requests, then by pod priority, then by how far usage exceeds requests. In practice BestEffort pods go first, since any usage exceeds their zero requests, but Burstable and Guaranteed pods are not safe when pressure is severe.

If your cluster uses only default kubelet configuration, only hard thresholds are active. The node jumps directly to eviction with no grace period once disk crosses the default limits. Configure soft thresholds with a grace period in the kubelet configuration if you need reaction time.
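
Below is a minimal sketch of what soft thresholds can look like. The field names are standard KubeletConfiguration fields; the threshold values are illustrative, and the config path and restart step assume a kubeadm-style node where the kubelet reads /var/lib/kubelet/config.yaml, which varies by distribution.

# Sketch: soft eviction thresholds with a grace period before eviction starts.
# Merge these keys into the existing kubelet config instead of appending if
# they are already present, then restart the kubelet.
cat <<'EOF' >> /var/lib/kubelet/config.yaml
evictionSoft:
  nodefs.available: "15%"
  imagefs.available: "20%"
evictionSoftGracePeriod:
  nodefs.available: "2m"
  imagefs.available: "2m"
evictionMaxPodGracePeriod: 60
EOF
systemctl restart kubelet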

If nodefs and imagefs share a partition, pressure on one means pressure on both. The kubelet triggers image garbage collection when imagefs crosses its high threshold (default 85%), removing unused images until usage drops below the low threshold (default 80%). If log accumulation or volume data is the root cause, image GC cannot help, and pod eviction becomes the only remaining mechanism.

Common causes

Cause | What it looks like | First thing to check
Container log accumulation | /var/log/pods or the CRI log path grows without bound; single verbose containers fill the disk | du -sh /var/log/pods/* and check for missing log rotation
Image cache bloat | Many distinct image versions accumulate; imagefs nears capacity while nodefs is fine | crictl images to list image sizes and tag diversity
Unbounded emptyDir volumes | Applications write temp data, caches, or logs to emptyDir without sizeLimit | du under /var/lib/kubelet/pods for emptyDir growth
Orphaned pod directories | Deleted pods leave volume or container data in /var/lib/kubelet/pods | Compare directory list against running pods; check for stale CSI mount paths
Inode exhaustion | df -h shows available space but df -i shows inodes at 100%; common with Node.js node_modules or many small files | df -i /var/lib/kubelet and find for directories with excessive file counts
Zombie processes holding deleted logs | Disk usage at 100% but du cannot account for the space; logs were deleted but file handles remain | lsof +L1 to find open deleted files

Quick checks

Run these from the affected node or via kubectl to confirm the condition and identify the resource under pressure.

# Check the node condition
kubectl get node ${NODE_NAME} -o jsonpath='{.status.conditions[?(@.type=="DiskPressure")]}'

# Check filesystem space for nodefs and imagefs
df -h /var/lib/kubelet
df -h /var/lib/containerd

# Check inode usage separately
df -i /var/lib/kubelet
df -i /var/lib/containerd

# Check kubelet eviction metrics via the API server
kubectl get --raw "/api/v1/nodes/${NODE_NAME}/proxy/metrics" | grep kubelet_evictions

# Find the largest log directories
du -sh /var/log/pods/* 2>/dev/null | sort -rh | head -10

# Check image disk usage
crictl images -o json | jq -r '.images[] | "\(.size) \(.repoTags[0] // "untagged")"' | sort -rn | head -10

# Find large emptyDir volumes; paths vary by CRI and storage driver
find /var/lib/kubelet/pods -type d -name "*empty-dir*" -exec du -sh {} + 2>/dev/null | sort -rh | head -10

# Check for deleted files still held open
lsof +L1 | awk 'NR>1 && $7~/^[0-9]+$/ {print $7, $1, $2}' | sort -rn | head -10

# List recently evicted pods on the node
kubectl get pods --all-namespaces --field-selector spec.nodeName=${NODE_NAME},status.phase=Failed -o json | \
  jq '.items[]? | select(.status.reason=="Evicted") | {name: .metadata.name, message: .status.message}'

How to diagnose it

  1. Confirm which filesystem is under pressure. Run df -h and df -i for both /var/lib/kubelet and the container runtime root. If they share a partition, treat them as a single pool. If they are separate, disk pressure on imagefs points toward images and overlay data, while nodefs pressure points toward logs, emptyDir volumes, and kubelet state.

  2. Determine whether it is a space or inode problem. Inode exhaustion behaves differently from byte exhaustion. Applications fail with No space left on device even when df -h shows gigabytes free. If df -i reports usage above 95%, the diagnosis path focuses on finding directories with millions of small files rather than a few large ones.

  3. Identify the largest consumers. On nodefs, use du on /var/log/pods, /var/lib/kubelet/pods, and /var/log/containers to find the top consumers. On imagefs, use crictl images to identify unexpectedly large or numerous images. If image GC is running but disk keeps filling, the consumption rate exceeds the cleanup rate.

  4. Check if image GC is keeping up. Review kubelet logs for image GC events: journalctl -u kubelet --since "1 hour ago" | grep -i "image.*gc\|garbage.*image". If GC runs but reclaims zero bytes, every image on the node may be in use by running containers, meaning you need a larger disk or fewer distinct images.

  5. Correlate evictions with specific pods. Use kubectl get events --field-selector reason=Evicted to see which pods were evicted and whether the eviction signal was nodefs.available, imagefs.available, or an inode signal. If the same workload is repeatedly evicted and rescheduled to the same node, the node may be oscillating between pressure and recovery without resolving the underlying consumer.

  6. Check for open deleted files. If disk usage is at 100% but directory sizes do not add up, run lsof +L1. Log rotation scripts that truncate without notifying the container runtime, or applications that hold file handles after deletion, can hide disk consumption from du.

  7. Evaluate kubelet configuration. Inspect the kubelet configuration for evictionHard thresholds and image GC settings; a command sketch for reading the effective configuration follows this list. Note that df and the kubelet may disagree on available space if the filesystem has reserved blocks (ext4 reserves 5% for root by default).
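
For the last step, the kubelet's effective configuration can be read through the API server's node proxy. A sketch, assuming the configz endpoint is reachable from your kubectl context:

# Read the effective eviction thresholds, image GC settings, and log rotation
# settings straight from the running kubelet
kubectl get --raw "/api/v1/nodes/${NODE_NAME}/proxy/configz" | \
  jq '.kubeletconfig | {evictionHard, evictionSoft, evictionSoftGracePeriod,
      imageGCHighThresholdPercent, imageGCLowThresholdPercent,
      containerLogMaxSize, containerLogMaxFiles}'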

Metrics and signals to monitor

Signal | Why it matters | Warning sign
DiskPressure node condition | Hard eviction starts when this is True | Any transition to True
nodefs.available / imagefs.available | Available space before eviction triggers | < 15% for nodefs, < 20% for imagefs
nodefs.inodesFree / imagefs.inodesFree | File creation capacity; inode exhaustion blocks writes independently of space | < 10% free
kubelet_evictions | Count of pods evicted per eviction signal; confirms active pressure | Any increase on nodefs.available or imagefs.available
Container log growth rate | Container stdout/stderr logs write to nodefs and can outpace rotation | Sustained growth above baseline, e.g., > 100 MB per hour
Image GC frequency | Indicates imagefs is nearing thresholds and cleanup is active | GC events more than once per hour
emptyDir volume usage | Unbounded emptyDir consumption is a common root cause of nodefs pressure | Any emptyDir without sizeLimit on a busy node
Pod startup latency | Disk pressure slows image pulls and volume mounts | p99 pod start duration > 30 seconds during pressure

Fixes

If container logs are filling nodefs

Configure log rotation via the kubelet configuration file using containerLogMaxSize and containerLogMaxFiles. If the node is already under pressure and you need immediate relief, truncating logs is destructive but fast:

# Destructive: truncates container logs to free space immediately
# Verify the exact file before truncating; paths vary by CRI and runtime
> /var/log/pods/<container-log-path>

Address the root cause by fixing overly verbose application logging or reducing log retention.
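
The rotation settings live in the KubeletConfiguration. A minimal sketch; the values are illustrative, and the config path and restart step assume a kubeadm-style node:

# Sketch: cap per-container log size and rotated file count
# (the kubelet defaults are 10Mi and 5 files). Merge these keys into the
# existing kubelet config instead of appending if they are already present.
cat <<'EOF' >> /var/lib/kubelet/config.yaml
containerLogMaxSize: 50Mi
containerLogMaxFiles: 3
EOF
systemctl restart kubelet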

If images are bloating imagefs

If image garbage collection cannot keep up, manually remove unused images through the container runtime. This is disruptive because subsequent pod starts will re-pull removed images.
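
A sketch of manual cleanup with crictl, assuming a crictl version that supports --prune; any pod that later needs a removed image will re-pull it:

# Disruptive: removes every image not referenced by a running container
crictl rmi --prune

# Or remove a single oversized image by ID after checking what is present
crictl images
crictl rmi <image-id>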

Adjust imageGCHighThresholdPercent and imageGCLowThresholdPercent in the kubelet configuration if the defaults are too aggressive or too lax for your workload density. Reducing image tag diversity per node also lowers cache pressure.
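
As a sketch, the two fields sit in the same KubeletConfiguration; the example values below start GC earlier and reclaim more per pass than the 85/80 defaults, and the config path is an assumption:

# Sketch: trigger image GC at 75% usage and clean down to 65%.
# Merge into the existing kubelet config, then restart the kubelet.
cat <<'EOF' >> /var/lib/kubelet/config.yaml
imageGCHighThresholdPercent: 75
imageGCLowThresholdPercent: 65
EOF
systemctl restart kubelet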

If emptyDir or volumes are consuming space

Add sizeLimit to emptyDir volumes in pod specs. When an emptyDir exceeds its limit, the kubelet evicts the pod. Also set ephemeral-storage requests and limits on pods so the scheduler accounts for disk usage during placement.
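
A minimal pod sketch showing both controls together; the pod, container, and volume names and the sizes are illustrative:

# Sketch: emptyDir with sizeLimit plus ephemeral-storage requests and limits
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: scratch-demo
spec:
  containers:
  - name: app
    image: busybox:1.36
    command: ["sh", "-c", "sleep 3600"]
    resources:
      requests:
        ephemeral-storage: "1Gi"
      limits:
        ephemeral-storage: "2Gi"
    volumeMounts:
    - name: scratch
      mountPath: /scratch
  volumes:
  - name: scratch
    emptyDir:
      sizeLimit: 1Gi
EOF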

For orphaned pod directories that remain after deletion, manually remove the directory only after confirming the pod is no longer running and the volume is unmounted:

# Verify the pod is gone and volumes are unmounted
mount | grep /var/lib/kubelet/pods/<pod-uid>
# Only remove if confirmed orphaned and unmounted
rm -rf /var/lib/kubelet/pods/<pod-uid>/

If inodes are exhausted

Find directories with excessive file counts:

# Find directories with the most files (requires GNU find)
find /var/lib/kubelet -xdev -printf '%h\n' | sort | uniq -c | sort -rn | head -20

Remove unnecessary cache layers, old build artifacts, or temporary files. If the filesystem was created with too few inodes for the workload, the only durable fix is to rebuild the filesystem or move the workload to a node with adequate inode capacity.

If the node is in an eviction loop

A node oscillating between DiskPressure True and False is thrashing. Cordon the node immediately to stop new scheduling:

kubectl cordon ${NODE_NAME}

Delete evicted pods to clear scheduler backpressure, then resolve the disk consumer. Do not uncordon until df shows sustained headroom well below the eviction threshold.
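
A sketch for clearing Evicted pod records across namespaces before tackling the disk consumer:

# Delete Failed pods whose recorded reason is Evicted, across all namespaces
kubectl get pods --all-namespaces --field-selector status.phase=Failed -o json | \
  jq -r '.items[] | select(.status.reason=="Evicted") | "\(.metadata.namespace) \(.metadata.name)"' | \
  while read -r ns name; do kubectl delete pod -n "$ns" "$name"; done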

Prevention

  • Configure log rotation. Set containerLogMaxSize and containerLogMaxFiles in the kubelet configuration to prevent a single verbose container from filling nodefs.
  • Set emptyDir quotas. Enforce sizeLimit on emptyDir volumes and ephemeral-storage limits on pods via LimitRanges or ResourceQuotas; a minimal LimitRange sketch follows this list.
  • Monitor inodes proactively. Alert on inode usage separately from disk space usage; inode exhaustion is equally disruptive and harder to diagnose.
  • Size node disks for workload diversity. Ensure imagefs can hold the working set of images for the node pool without constant GC thrashing. If nodes pull many distinct images, disk requirements grow non-linearly.
  • Clean up evicted pods. Evicted pods can persist in the API and create noise. Delete them as part of regular cluster hygiene.
  • Review kubelet eviction configuration. Verify that evictionHard thresholds match your operational expectations and that reserved blocks on ext4 filesystems are accounted for in capacity planning.
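
For the emptyDir quota bullet, a minimal LimitRange sketch; the namespace, object name, and sizes are illustrative:

# Sketch: default ephemeral-storage requests and limits for containers
# that do not declare their own
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: LimitRange
metadata:
  name: ephemeral-storage-defaults
  namespace: default
spec:
  limits:
  - type: Container
    default:
      ephemeral-storage: "2Gi"
    defaultRequest:
      ephemeral-storage: "512Mi"
    max:
      ephemeral-storage: "8Gi"
EOF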

How Netdata helps

Netdata correlates node-level disk signals with workload behavior:

  • Disk space and inode utilization per mount point, with alerts before kubelet eviction thresholds are hit.
  • Container log size tracking, where available, to identify pods writing aggressively to nodefs.
  • Kubelet eviction event correlation with disk saturation charts to confirm whether nodefs or imagefs triggered the eviction.
  • Pod-level ephemeral storage usage to catch emptyDir or local volume growth before it triggers node-wide pressure.