Docker image pull failures: registry, network, and auth diagnosis
What this means
When you run docker pull, the daemon negotiates a TLS connection to the registry, authenticates if required, resolves the manifest for the requested tag and architecture, then downloads missing layers. If any step fails, the pull aborts.
Registry errors surface as HTTP 429 or 401/403 responses. Network errors appear as timeouts, connection resets, or TLS handshake failures. Local problems such as a full disk or a hung daemon can also abort a pull even when the registry is healthy. In orchestrated environments, a single node’s pull failure can trigger the scheduler to retry on other nodes, turning a localized auth error into a cluster-wide rate limit storm. Distinguishing these layers quickly is the difference between a five-minute fix and a prolonged outage.
Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| Registry rate limiting | HTTP 429 or “toomanyrequests”; pulls succeed during low traffic | Auth tier: anonymous vs. authenticated; rate limit headers |
| Authentication failure | “denied” or “unauthorized” when pulling private images | ~/.docker/config.json and docker login state |
| Network or DNS path failure | “connection refused”, “no such host”, or “unexpected EOF” mid-download | DNS resolution and TCP connectivity to registry endpoint |
| Missing tag or manifest | “manifest unknown” for a specific tag; other tags pull fine | Tag existence in registry; manifest architecture match |
| Local disk exhaustion | “no space left on device” during layer extraction | docker system df; df -h /var/lib/docker/ |
| TLS certificate problem | “x509: certificate signed by unknown authority” against a private registry | System CA bundle and registry certificate |
| MTU mismatch or proxy | “unexpected EOF” mid-download; first attempt fails but retries succeed | docker0 MTU vs. host interface; proxy and firewall rules |
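As a rough illustration of how the table can drive triage, the sketch below maps an error string to one of the causes listed above. The function name `classify_pull_error` and the category labels are hypothetical, and the patterns are only the substrings quoted in the table, so real daemon messages may need more cases.

```shell
# Hypothetical helper: map a pull error message to a cause from the table above.
# The body runs in a subshell so the working variable does not leak.
classify_pull_error() (
  # Lowercase the message so matching is case-insensitive.
  msg=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$msg" in
    *toomanyrequests*|*"too many requests"*) echo "rate-limit" ;;
    *unauthorized*|*denied*)                 echo "auth" ;;
    *"manifest unknown"*)                    echo "missing-tag" ;;
    *"no space left on device"*)             echo "disk" ;;
    *x509*)                                  echo "tls" ;;
    *"no such host"*|*"connection refused"*|*"unexpected eof"*)
                                             echo "network" ;;
    *)                                       echo "unknown" ;;
  esac
)
```

Order matters here: "denied" is matched early, so an unrelated "permission denied" (for example on the Docker socket) would also be labeled auth. Treat the result as a hint for which row to check first, not as a verdict.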
Quick checks
Run these checks first. Favor read-only commands before making changes.
Daemon responsiveness. A hung daemon returns slowly or not at all. This distinguishes local daemon problems from registry problems.
time curl --max-time 5 --unix-socket /var/run/docker.sock http://localhost/_ping
Docker disk usage. Layer extraction needs free space. A full disk can abort a pull mid-download.
docker system df
df -h /var/lib/docker/
Recent pull events and errors. The events stream shows requested images; daemon logs contain the full registry response.
docker events --filter event=pull --since 1h --format '{{.Time}} {{.Actor.ID}}'
journalctl -u docker.service | grep -i "pull\|download\|layer"
Reproduce the failure. Capture the exact error and measure latency against your baseline. This writes image layers to disk.
time docker pull <image>:<tag>
Host DNS resolution. If the host cannot resolve the registry hostname, Docker cannot connect.
nslookup <registry-hostname>
Authentication state. Expired or missing credentials produce auth errors that look like repository denial. This file contains secrets.
cat ~/.docker/config.json
Storage driver status. Corrupt overlay2 metadata causes pull failures that resemble registry errors.
docker info | grep -A5 "Storage Driver"
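The df -h check above is meant to be read by eye. For a scripted version, a minimal sketch might parse the POSIX `df -P` output and compare it with the 80 percent level this guide uses as a warning sign. The function names `disk_pct_used` and `disk_over_threshold` are hypothetical, and the parsing assumes the filesystem name contains no spaces.

```shell
# Print the integer use% of the filesystem holding a path.
# df -P guarantees one data line per filesystem; field 5 is the capacity, e.g. "42%".
disk_pct_used() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed (exit 0) when usage is above the threshold.
# The default of 80 is the warning level used in this guide; pass a second argument to override.
disk_over_threshold() (
  pct=$(disk_pct_used "$1")
  [ -n "$pct" ] || exit 2   # no usable df output
  [ "$pct" -gt "${2:-80}" ]
)

# Usage sketch:
#   disk_over_threshold /var/lib/docker && echo "free space before pulling"
```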
How to diagnose it
Read the exact error from the daemon logs. The CLI output is often truncated. Look for “unauthorized” for auth issues, “connection refused” or “EOF” for network issues, “no space left on device” for disk issues, and “manifest unknown” for missing tags. Use the log line to decide which branch to follow next.
Verify local daemon and storage health. Run the /_ping probe and docker system df. A hung daemon or full disk mimics registry failures. If /_ping takes longer than one second or the disk is more than 80 percent full, fix the local condition first. A daemon that is alive but slow can drop connections mid-pull.
Test registry reachability from the host. Use curl -I https://<registry>/v2/ or nc -zv <registry> 443 from the host. If the host cannot connect, the problem is outside Docker: check DNS resolution, routing tables, host firewall rules, and physical links. If the host connects but Docker does not, inspect the daemon’s proxy environment variables and the bridge MTU. An MTU mismatch between the host interface and the Docker bridge causes silent connection drops during large layer downloads.
Verify authentication state. Inspect ~/.docker/config.json and re-run docker login <registry>. Tokens expire, and credential helpers vary by operating system. If re-authenticating fixes the pull, the previous token had expired or the credential helper held stale data.
Confirm the tag and architecture exist. A “manifest unknown” error means the tag does not exist. If the tag exists but the multi-architecture manifest does not include the host’s platform, Docker typically reports “no matching manifest” for that platform instead. Try pulling by digest instead of tag, or query the registry to verify the tag is still published.
Check for registry rate limiting. Look for HTTP 429 in the daemon logs or ratelimit-remaining headers near zero. If you are limited, authenticate, switch to a mirror, or wait for the window to reset.
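To make the rate limit check concrete, the sketch below pulls the numeric ratelimit-remaining value out of a set of HTTP response headers. The function name `ratelimit_remaining` is hypothetical. The commented usage targets Docker Hub specifically, using the test repository Docker documents for this purpose; other registries may not send these headers at all.

```shell
# Read HTTP response headers on stdin and print the numeric ratelimit-remaining value.
# Docker Hub sends values shaped like "ratelimit-remaining: 76;w=21600".
ratelimit_remaining() {
  tr -d '\r' | awk -F': *' '
    tolower($1) == "ratelimit-remaining" {
      split($2, parts, ";")   # drop the ";w=<window seconds>" suffix
      print parts[1]
      exit
    }'
}

# Usage sketch against Docker Hub (needs network access and jq):
#   token=$(curl -s "https://auth.docker.io/token?service=registry.docker.io&scope=repository:ratelimitpreview/test:pull" | jq -r .token)
#   curl -s --head -H "Authorization: Bearer $token" \
#     https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest | ratelimit_remaining
```

The function prints nothing when the header is absent, which is worth treating as "unknown" rather than "unlimited".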
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
| Image pull rate | Cache thrashing or pull storms | Sustained rate above deployment frequency with no new containers |
| Pull latency | Delays scaling and recovery | Greater than 3x baseline or sustained latency above 5 minutes |
| Docker disk usage | Blocks layer writes | Greater than 80 percent used |
| Daemon responsiveness | A hung daemon cannot complete pulls | /_ping response above 1 second or timeout |
| Container network errors | Corrupts layer downloads | Nonzero rx_errors or dropped packets on container interfaces |
| Registry rate limit headers | 429 responses | ratelimit-remaining approaching zero |
| Container creation failures | Pull failures surface as create errors | Failure rate above zero for new images |
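The pull latency warning sign in the table can be expressed as a small check. This is a sketch only: the function name `pull_latency_alert` is hypothetical, both arguments are whole seconds, and the baseline is whatever you have measured for the same image on a healthy day.

```shell
# Succeed (exit 0) when a measured pull time breaches either warning sign from the table:
# more than 3x the baseline, or more than 5 minutes (300 seconds).
pull_latency_alert() (
  baseline=$1
  measured=$2
  [ "$measured" -gt $((baseline * 3)) ] || [ "$measured" -gt 300 ]
)

# Usage sketch (this pulls the image, so it writes layers to disk):
#   start=$(date +%s); docker pull <image>:<tag>; end=$(date +%s)
#   pull_latency_alert 20 $((end - start)) && echo "pull latency above warning level"
```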
Fixes
If the cause is registry rate limiting
Authenticate with docker login to move from anonymous to authenticated tier. Deploy a local pull-through registry mirror to reduce external requests. Avoid floating tags such as latest; they force unnecessary re-pulls.
If the cause is authentication failure
Re-run docker login <registry> and verify the entry in ~/.docker/config.json. If you are running in Kubernetes, confirm the pod references an imagePullSecret; Kubernetes does not use the node daemon’s config.json for pod image pulls. For private registries, check that the authentication token has not expired and that the credential helper is storing the secret correctly.
If the cause is network or DNS
Verify DNS resolution for the registry hostname from the host. Check the bridge MTU; a mismatch between the host interface and docker0 causes “unexpected EOF” during large layer downloads. If the host uses an HTTP proxy, ensure the daemon environment includes HTTP_PROXY and HTTPS_PROXY, and restart the daemon after any change. If the proxy terminates TLS, ensure the system trust store includes the proxy’s CA certificate.
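On systemd hosts, the daemon's proxy variables are commonly set through a drop-in file for docker.service. The sketch below only prints the drop-in content so it can be reviewed before anything is written. The function name `render_docker_proxy_dropin` and the proxy URL in the usage comment are hypothetical, and it assumes the same proxy serves both HTTP and HTTPS.

```shell
# Print a systemd drop-in that sets proxy variables for the Docker daemon.
# $1: proxy URL (assumed here to serve both HTTP and HTTPS traffic)
# $2: optional NO_PROXY list; defaults to loopback addresses only
render_docker_proxy_dropin() {
  printf '[Service]\n'
  printf 'Environment="HTTP_PROXY=%s"\n' "$1"
  printf 'Environment="HTTPS_PROXY=%s"\n' "$1"
  printf 'Environment="NO_PROXY=%s"\n' "${2:-localhost,127.0.0.1}"
}

# Usage sketch (writes configuration and restarts the daemon; review the output first):
#   render_docker_proxy_dropin http://proxy.example.internal:3128 |
#     sudo tee /etc/systemd/system/docker.service.d/http-proxy.conf
#   sudo systemctl daemon-reload && sudo systemctl restart docker
#   systemctl show --property=Environment docker
```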
If the cause is storage or I/O
Free space under /var/lib/docker by pruning dangling images, truncating oversized container logs, and removing unused volumes. If pulls fail with layer errors after an unclean shutdown, overlay2 metadata may be corrupt. Remove the affected image with docker rmi and re-pull it. If removal fails because a stopped container references the image, remove the container first. docker system prune deletes stopped containers, unused networks, dangling images, and unused build cache. Run it only after confirming what will be deleted.
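Before truncating anything, it helps to see which container logs are actually large. The read-only sketch below lists log files above a size limit. It assumes the json-file logging driver and the default data root, and the function name `find_large_container_logs` is hypothetical. The size argument uses find's own syntax; the `M` suffix in the default is a GNU find extension.

```shell
# List container log files larger than a given size without modifying them.
# $1: directory to search (defaults to the json-file driver location under the default data root)
# $2: size in find -size syntax, without the leading "+" (defaults to 100M)
find_large_container_logs() {
  find "${1:-/var/lib/docker/containers}" -type f -name '*-json.log' -size +"${2:-100M}"
}

# Usage sketch (reading /var/lib/docker usually needs root):
#   sudo sh -c 'find /var/lib/docker/containers -type f -name "*-json.log" -size +100M'
```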
If the cause is a missing or incorrect image reference
Verify the tag exists in the registry. Pull by digest instead of tag for immutable references. Confirm the image manifest includes the host architecture; multi-arch images that omit the requested platform typically fail with “no matching manifest” rather than “manifest unknown”. If the publisher recently updated the image, they may have deleted the tag.
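One way to enforce digest pinning is to check references in deployment configs before they ship. The sketch below tests whether a single image reference is pinned by a sha256 digest. The function name `is_digest_pinned` is hypothetical, and the pattern is a simplification that does not validate the repository name itself.

```shell
# Succeed (exit 0) when an image reference is pinned by digest: <name>@sha256:<64 hex chars>.
is_digest_pinned() {
  printf '%s\n' "$1" | grep -Eq '^[^@[:space:]]+@sha256:[0-9a-f]{64}$'
}

# Usage sketch: print the digest form of an image that is already present locally.
#   docker image inspect --format '{{index .RepoDigests 0}}' <image>:<tag>
```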
Prevention
Set log rotation and disk cleanup policies. Alert on pull latency above 3x baseline or 5 minutes. Pin images by digest in deployment configs. Use a local registry mirror for frequently pulled base images. Include daemon pull errors in deployment health checks. Set container resource limits so image extraction does not starve the daemon during large pulls.
How Netdata helps
- Compare pull latency against host network I/O, disk I/O, and daemon CPU to locate local vs. external bottlenecks.
- Alert on Docker disk usage before it blocks layer writes.
- Track daemon /_ping latency to distinguish registry outages from daemon hangs.
- Monitor container creation failure rates; they spike after pull failures.
- Plot image pull rate against registry connectivity errors.
Related guides
- If docker ps or docker inspect hangs while you are diagnosing, see Docker commands hang: docker ps, inspect, and exec freezes.
- If the daemon itself becomes unresponsive during pulls, see Docker daemon not responding: how to troubleshoot a hung dockerd.
- If containers exit immediately after a successful pull, see Docker container exits immediately: how to diagnose it.
- For disk space issues that block pulls, see Docker disk space full: how to troubleshoot /var/lib/docker.
- For DNS resolution issues inside containers that can mimic registry failures, see Docker DNS not working inside containers.





