Docker disk space full: how to troubleshoot /var/lib/docker
You notice deployments failing with “no space left on device,” image pulls hanging, or the Docker daemon becoming sluggish. On a Docker host, everything lives under /var/lib/docker: image layers, container writable layers and logs, named volumes, and build cache. When this filesystem fills, the failure is cascading and abrupt. New containers cannot start, running containers may fail on writes, and daemon operations deadlock.
This guide walks through identifying which of the five major consumers is dominating your disk, safely reclaiming space without deleting data you still need, and fixing the configuration gaps that let it happen again.
What this means
Docker stores all runtime data under /var/lib/docker. On a modern host using the overlay2 storage driver, the key subdirectories are:
- overlay2/: image layers, container writable layers, and BuildKit snapshotter data
- containers/: container metadata and json-file logs
- volumes/: named and anonymous volume data
- image/: image manifests and configuration JSON
- buildkit/: BuildKit cache and build state
If your storage driver is not overlay2, the path may differ. Note that devicemapper was removed in Docker Engine v25.0; modern hosts should use overlay2.
The docker system df command breaks usage into four categories: Images, Containers, Local Volumes, and Build Cache. Each shows a SIZE and a RECLAIMABLE column. RECLAIMABLE is what Docker knows it can safely delete, but this number does not include volumes unless you explicitly target them, and it may undercount orphan overlay2 layers that have lost their metadata references.
When /var/lib/docker fills, there is no graceful degradation. The daemon may hang during storage operations, container creates fail, and json-file logs cannot rotate because there is no space to create new files.
Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| Unbounded container logs (json-file driver) | /var/lib/docker/containers/ grows rapidly; individual *-json.log files in the gigabytes | find /var/lib/docker/containers/ -name "*-json.log" -exec ls -lh {} \; |
| Dangling or unused images | Image count is high but active container count is low; docker system df Images line shows large RECLAIMABLE | docker images --format '{{.Size}}\t{{.Repository}}:{{.Tag}}' \| sort -hr |
| Orphaned volumes | Volume usage grows even after containers are removed | docker volume ls and du -sh /var/lib/docker/volumes/*/ |
| Build cache accumulation (CI/CD hosts) | Build Cache line or /var/lib/docker/buildkit/ is tens of gigabytes | docker system df Build Cache line; docker builder prune --dry-run |
| Application writing to container filesystem | Container writable layers grow because data is not on a volume | docker ps -a --size --format '{{.Names}}\t{{.Size}}' |
| Orphan overlay2 layers | du shows large overlay2/ but docker system df reports 0B reclaimable | ls /var/lib/docker/overlay2/ \| wc -l to count layer directories |
Quick checks
# Check filesystem utilization for the Docker data directory
df -h /var/lib/docker
# Docker's own accounting of space by category
docker system df
# Verbose per-object breakdown (slow on large hosts)
docker system df -v
# Host-level directory sizes to find the dominant subdirectory
du -sh /var/lib/docker/*/
# Container log file sizes (json-file driver)
find /var/lib/docker/containers/ -name "*-json.log" -exec ls -lh {} \;
# Largest images by size
docker images --format '{{.Size}}\t{{.Repository}}:{{.Tag}}' | sort -hr | head -20
# Writable layer sizes for containers
# WARNING: this triggers a filesystem walk and is expensive; do not poll frequently
docker ps -a --size --format "table {{.Names}}\t{{.Size}}"
# Volume disk usage
sudo du -sh /var/lib/docker/volumes/*/
# Build cache dry-run to see reclaimable space
docker builder prune --dry-run
# Count dangling images
docker images -f "dangling=true" -q | wc -l
# Check active overlay mounts (active containers create merge mounts)
mount | grep overlay2
How to diagnose it
1. Confirm the filesystem is actually full. Run `df -h /var/lib/docker`. If the backing filesystem is at 100%, every write operation will fail. If it is separate from the root filesystem, the host may look healthy while Docker is dead.
2. Get Docker's accounting. Run `docker system df`. Compare the SIZE and RECLAIMABLE columns across Images, Containers, Local Volumes, and Build Cache. If RECLAIMABLE is large relative to SIZE, you have tracked garbage that Docker can remove safely.
3. Cross-check with host-level tools. Run `du -sh /var/lib/docker/*/`. This tells you which subdirectory is the largest. Note that `du` can double-count overlay2 layers because stacked overlay mounts include lowerdir data, so treat overlay2/ numbers as directional, not exact. If containers/ is the largest, logs are likely the culprit. If volumes/ dominates, check for orphaned database or data volumes.
4. Inspect container logs. If containers/ is large, run `find /var/lib/docker/containers/ -name "*-json.log" -exec ls -lh {} \;`. A single multi-gigabyte log file from a verbose or crash-looping container is the most common silent grower. The json-file driver has no size limit unless `max-size` is configured.
5. Audit images and dangling layers. If image storage is large, list images sorted by size. Dangling images (`docker images -f "dangling=true"`) are typically safe to remove, but be aware that parent images shared by multiple tagged images are protected: `docker image prune` will not delete an image that serves as a parent for another image until the child is removed first.
6. Check build cache separately. The Build Cache line in `docker system df` maps to the legacy builder. If you use BuildKit, run `docker buildx du` for the authoritative view. `docker system prune` does not cover BuildKit-native cache; you must run `docker buildx prune` independently.
7. Evaluate volume bloat. Run `docker volume ls` and inspect `/var/lib/docker/volumes/*/`. Volumes persist after their container is removed unless the container was removed with `-v`. `docker system prune --volumes` is required to remove unused volumes; without `--volumes`, they survive even with `--all`.
8. Look for untracked overlay2 bloat. If `docker system df` reports 0B reclaimable but `du` shows a massive overlay2/ directory, you may have orphaned layer directories left by unclean shutdowns, parallel build races, or old storage drivers. In this state, `docker system prune -af` recovers nothing. `docker buildx prune -af` may recover BuildKit-related space. Full recovery of orphaned metadata sometimes requires stopping the Docker daemon and removing the contents of /var/lib/docker, which is destructive and requires re-pulling all images.
9. Review container writable layers. Run `docker ps -a --size`. Containers should ideally write nothing to their writable layer. Layers growing into gigabytes indicate the application is writing logs, temp files, or data inside the container instead of to a mounted volume.
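The host-level checks above (df, du, and the log-file scan) can be combined into a quick triage helper. This is a minimal sketch: `docker_disk_triage` is a hypothetical name, the 100MB log threshold is an assumption, and on a real host it should run as root with your actual data-root (see `docker info --format '{{.DockerRootDir}}'`).

```shell
# Hedged sketch: one-shot triage summary of the Docker data root.
# docker_disk_triage is a hypothetical helper; thresholds are assumptions.
docker_disk_triage() {
    root="${1:-/var/lib/docker}"
    echo "== Filesystem utilization =="
    df -h "$root" 2>/dev/null
    echo "== Largest subdirectories =="
    du -sh "$root"/*/ 2>/dev/null | sort -hr | head -5
    echo "== json-file logs over 100MB =="
    find "$root/containers" -name '*-json.log' -size +100M -exec ls -lh {} \; 2>/dev/null
}

docker_disk_triage /var/lib/docker || true  # non-fatal if run off-host
```

Whichever section dominates the "Largest subdirectories" output tells you which of the fixes below to start with.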
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
| Filesystem usage on /var/lib/docker | The cliff-edge failure point | >80% used |
| Docker Disk Usage total (docker system df) | Tracks images, containers, volumes, and build cache in one view | RECLAIMABLE >20% of total SIZE |
| Container log file sizes | json-file logs grow unbounded without rotation | Any single *-json.log >1GB |
| Image count and dangling image count | Old images accumulate over deployments | Dangling images >100 or >10GB |
| Build cache size | CI/CD runners accumulate cache rapidly | Build cache >20GB or >20% of Docker disk usage |
| Volume usage | Orphaned volumes persist silently | Rapid growth in /var/lib/docker/volumes/ |
| Container writable layer size | Apps writing to overlay2 instead of volumes | Any container writable layer >1GB |
| Docker daemon errors | Storage driver errors precede corruption | “no space left on device” or overlay2 errors in journal |
Fixes
If logs are consuming the disk
Truncating a live json-file log is safe because the file is opened with O_APPEND. You can zero a large log immediately to recover space:
# Truncate a specific container log (safe while container is running)
truncate -s 0 /var/lib/docker/containers/<container-id>/<container-id>-json.log
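If many containers are affected, a small wrapper can sweep every oversized log at once. This is a sketch: `truncate_big_logs` is a hypothetical helper name and the +500M threshold is an assumption; capture anything you still need from the logs first, because truncation discards the history.

```shell
# Hedged sketch: truncate every json-file log above a size threshold.
# Safe while containers run (the json-file writer appends), but the
# existing log content is lost.
truncate_big_logs() {
    dir="$1"    # containers directory, e.g. /var/lib/docker/containers
    size="$2"   # find(1) size test, e.g. +500M
    find "$dir" -name '*-json.log' -size "$size" -print -exec truncate -s 0 {} \;
}

# Typical invocation (needs root):
[ -d /var/lib/docker/containers ] && truncate_big_logs /var/lib/docker/containers +500M
```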
Then fix the root cause. Add to /etc/docker/daemon.json:
{
"log-driver": "json-file",
"log-opts": {
"max-size": "10m",
"max-file": "3"
}
}
Restart the daemon to apply. Consider switching to the local log driver, which is more efficient and supports rotation by default. For a deeper treatment of log rotation, see Docker log rotation: preventing json-file logs from filling disk.
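Before restarting, it is worth validating the edited file: a JSON syntax error in daemon.json stops dockerd from starting at all. A minimal sketch, assuming python3 is available and the host uses systemd; `validate_daemon_json` is a hypothetical helper name.

```shell
# Hedged sketch: fail fast on malformed JSON before touching the daemon.
validate_daemon_json() {
    python3 -m json.tool "${1:-/etc/docker/daemon.json}" >/dev/null
}

# Then apply (existing containers keep their old log config; only
# recreated containers pick up the new defaults):
#   validate_daemon_json && sudo systemctl restart docker
```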
If unused images dominate
Targeted pruning is safer than a blanket docker system prune -af. Remove images older than a safe window:
# Remove dangling images
docker image prune -f
# Remove all unused images not referenced in 48 hours
docker image prune -a --filter "until=48h"
Remember that shared parent layers are protected. If an image is a base for others, Docker will retain it until all derived images are removed.
If volumes are consuming space
docker volume prune removes only volumes not attached to any container. If you need to remove stopped containers and their anonymous volumes together, use:
# Destructive: removes stopped containers, unused networks, dangling images, and unused volumes
docker system prune --all --volumes -f
Without --volumes, volumes survive. Always verify volume contents before pruning in production.
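To verify before pruning, size every volume directory and match the largest ones against `docker volume ls -f dangling=true`. A minimal sketch: `report_volume_sizes` is a hypothetical helper name, and reading the volumes directory requires root on a real host.

```shell
# Hedged sketch: rank volume directories by on-disk size so the biggest
# candidates can be inspected before any prune.
report_volume_sizes() {
    du -sh "${1:-/var/lib/docker/volumes}"/*/ 2>/dev/null | sort -hr
}

report_volume_sizes
```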
If build cache is the culprit
docker system prune does not clear BuildKit-native cache. Run:
# Dry-run first
docker builder prune --dry-run
# Clear legacy builder cache
docker builder prune -f
# Clear BuildKit cache
docker buildx prune -f
On Docker v25 and later, configure automatic garbage collection in daemon.json:
{
"builder": {
"gc": {
"defaultKeepStorage": "20GB"
}
}
}
If container writable layers are large
The application is writing data to the container filesystem instead of a volume. You cannot shrink a writable layer. Stop the container, remove it, and recreate it with a volume mount for the data path. Audit the application for log and temp file paths.
If overlay2 has untracked orphan layers
When docker system df shows 0B reclaimable but du shows massive overlay2/ usage, standard prune will not help. Try docker buildx prune -af first. If that fails, the remaining orphan directories have no metadata references. The last resort is to stop the Docker daemon and remove /var/lib/docker entirely, then re-pull images. This is destructive and requires planning for running containers.
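To scope the problem before resorting to a full wipe, you can compare the overlay2 directories on disk against the IDs the image layer database references. This is a heavily hedged sketch: the layerdb layout (cache-id and mount-id files) is an internal implementation detail that can change between Docker versions, and active containers or in-flight builds can legitimately appear unreferenced. Treat the output as a report for investigation, never as a delete list. Requires bash and root.

```shell
# Hedged sketch: report overlay2 directories not referenced by the image
# layer database. find_unreferenced_overlay2 is a hypothetical helper.
find_unreferenced_overlay2() {
    root="${1:-/var/lib/docker}"
    comm -23 \
        <(ls "$root/overlay2" 2>/dev/null | sed 's/-init$//' | grep -v '^l$' | sort -u) \
        <(find "$root/image/overlay2/layerdb" \
               \( -name cache-id -o -name mount-id \) \
               -exec awk '1' {} + 2>/dev/null | sort -u)
}

find_unreferenced_overlay2
```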
Prevention
- Configure log rotation before deploying workloads. The default json-file driver has no size limit. Set `max-size` and `max-file` in daemon.json, or use the `local` driver.
- Automate cleanup with filters, not broad prunes. Schedule `docker system prune --filter "until=48h"` and `docker volume prune` via cron or a systemd timer. Avoid unattended `docker system prune -af`, which removes all stopped containers and can destroy debug data.
- Mount volumes for all persistent or large data. Never let applications write logs, caches, or databases to the container filesystem.
- Monitor growth rate, not just absolute usage. Alert when /var/lib/docker exceeds 70-80% or grows faster than 1GB per day. Cleanup at 80% is safe; cleanup at 95% may fail because prune operations need working space.
- Cap build cache on CI runners. Use the daemon.json builder.gc settings on Docker v25+, or run `docker builder prune` and `docker buildx prune` as part of the CI teardown.
- Test your cleanup commands. Run prune with `--dry-run` or `--filter` during low-risk windows to verify what would be removed before automating it.
How Netdata helps
Netdata correlates host-level disk saturation on the /var/lib/docker filesystem with container runtime signals. When disk usage spikes, correlate it with:
- Container restart counts and exit codes: Crash loops flood logs and accelerate disk growth.
- Container block I/O: Identify which container is writing heavily to its writable layer or volumes.
- Docker daemon error logs: Storage driver errors and “no space left on device” messages appear alongside disk saturation.
- Container state distribution: A rising count of exited containers indicates cleanup failure before disk pressure becomes critical.
Related guides
- Docker container high CPU usage: causes and fixes
- Docker container high memory usage: how to diagnose it
- Docker container keeps restarting: causes, checks, and fixes
- Docker container memory leak: how to find one and prove it
- Docker container running but unhealthy: how to diagnose health check failures
- Docker CPU throttling: the hidden cause of container latency
- Docker daemon not responding: how to troubleshoot a hung dockerd
- Docker DNS not working inside containers
- Docker exit code 137: OOMKilled or SIGKILL?
- Docker log rotation: preventing json-file logs from filling disk
- Docker logs taking too much disk space: how to fix log growth
- Docker monitoring checklist: the signals every production host needs