Kubernetes imagePullSecrets: configuration, propagation, and rotation

Pods stuck in ImagePullBackOff are not always caused by a missing or incorrect image tag. Often the kubelet lacks valid registry credentials. Kubernetes uses imagePullSecrets, references to namespaced Secrets of type kubernetes.io/dockerconfigjson, to give the kubelet registry credentials for a pod's images. These can be attached directly to the pod spec or propagated through a ServiceAccount.

This guide covers propagation from registry to kubelet, verification at each link, and rotation without forcing a rolling restart of every workload.

What this means

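The kubelet pulls images on behalf of the pod. For private registries, it needs the equivalent of a Docker config.json, stored as a Secret of type kubernetes.io/dockerconfigjson in the same namespace as the pod. The kubelet reads the Secrets named in the pod's spec.imagePullSecrets. When a pod is created without that field, the ServiceAccount admission controller copies the imagePullSecrets of the pod's ServiceAccount into it.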

Attach the Secret to a ServiceAccount and every pod created with that account inherits it automatically, provided the pod does not set imagePullSecrets itself. This is the standard pattern for multi-pod authentication. Attach it directly to the pod spec and only that pod receives it. Namespace boundaries are strict: a Secret in namespace A cannot be used by a pod in namespace B. Static pods defined on the node filesystem cannot reference API server Secrets; they must rely on node-level credentials such as kubelet credential provider plugins.
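As a rough illustration of direct attachment, the sketch below prints a minimal Pod manifest that names a pull Secret in spec.imagePullSecrets. The function name print_pod_manifest, the container name app, and the example arguments are hypothetical, chosen only for this example.

```shell
# print_pod_manifest NAME IMAGE SECRET
# Prints a minimal Pod manifest that references SECRET directly in the pod spec.
# The Secret must already exist in the namespace the pod is created in.
print_pod_manifest() {
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: $1
spec:
  containers:
    - name: app
      image: $2
  imagePullSecrets:
    - name: $3
EOF
}
```

One possible use is to pipe the output to kubectl, for example print_pod_manifest demo registry.example.com/team/app:1.0 regcred | kubectl apply -n <namespace> -f -. Because this pod sets imagePullSecrets itself, it does not depend on what its ServiceAccount lists.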

flowchart TD
    A[Registry credentials] --> B[Secret type dockerconfigjson]
    B --> C{Attached to ServiceAccount?}
    C -->|Yes| D[ServiceAccount imagePullSecrets]
    C -->|No| E[Pod spec imagePullSecrets]
    D --> F[Pod inherits secrets]
    E --> F
    F --> G[Kubelet pulls image]
    G --> H{Auth valid?}
    H -->|Yes| I[Container starts]
    H -->|No| J[ImagePullBackOff]

Common causes

Cause | What it looks like | First thing to check
Secret name mismatch | ImagePullBackOff with secret not found in events | kubectl get secrets -n <ns>
Wrong registry hostname in secret | Pull fails with 401 even though the Secret exists | Decode secret and compare registry URL to image prefix
Expired credentials | Intermittent pulls failing after months of stability | Secret creation timestamp and registry token expiry
ServiceAccount not patched | New pods fail despite Secret existing | kubectl get sa <sa> -n <ns> -o yaml
Pod uses wrong ServiceAccount | Same as above, but default SA is unpatched | Pod spec serviceAccountName
Namespace isolation | Secret exists but is in a different namespace | Secret namespace vs pod namespace
Static pod limitation | Static pod in /etc/kubernetes/manifests/ cannot pull private image | Check if pod is static; use credential provider plugin instead

Quick checks

# Check if the secret exists and has the right type
kubectl get secret regcred -n <namespace> -o jsonpath='{.type}'

# Inspect the decoded .dockerconfigjson
kubectl get secret regcred -n <namespace> -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d

# Check which imagePullSecrets are attached to a ServiceAccount
kubectl get serviceaccount default -n <namespace> -o jsonpath='{.imagePullSecrets}'

# Check which ServiceAccount a pod is using
kubectl get pod <pod> -n <namespace> -o jsonpath='{.spec.serviceAccountName}'

# Check kubelet credential provider flags on the node
ps aux | grep kubelet | grep -E 'image-credential-provider'

# Check pod events for the specific pull error
kubectl get events --field-selector involvedObject.name=<pod> -n <namespace> --sort-by='.lastTimestamp'

How to diagnose it

  1. Read the event message. Failed to pull image with an authentication error means the problem is credentials, not the image tag or network.
  2. Verify the referenced Secret exists in the same namespace. If it does not, create it or correct the reference.
  3. Decode the Secret’s .dockerconfigjson and confirm the registry hostname matches the image prefix exactly. A secret for registry.example.com will not authenticate for registry.example.com:5000 unless the port is included.
  4. Check the ServiceAccount. If the pod uses the default ServiceAccount and that account has no imagePullSecrets, credentials will not be injected. Patch the ServiceAccount, then delete the pod to recreate it.
  5. Verify credential freshness. Old secrets may have expired tokens. Test the credential with docker login or the registry CLI.
  6. Check the pod type. If the pod is a static pod, it cannot use imagePullSecrets. Look for kubelet credential provider plugin configuration instead.
  7. If multiple imagePullSecrets reference the same registry, the kubelet tries the matching credentials one after another, which makes it hard to tell which one is in use. Pod-level imagePullSecrets are not merged with the ServiceAccount's list: the ServiceAccount list is applied only when the pod sets none. Reduce to one secret per registry.
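The hostname comparison in step 3 can be sketched as a small helper that derives the registry host from an image reference. The function name registry_host is hypothetical. It follows the usual Docker reference rule: the first path component counts as a registry only if it contains a dot or a colon, or is localhost; anything else is treated as a Docker Hub image.

```shell
# registry_host IMAGE
# Prints the registry host an image reference resolves to, including any port.
registry_host() {
  image=$1
  first=${image%%/*}
  if [ "$first" = "$image" ]; then
    # No slash at all, e.g. "nginx:1.27": implicit Docker Hub.
    echo "docker.io"
    return
  fi
  case $first in
    *.*|*:*|localhost) echo "$first" ;;
    *) echo "docker.io" ;;
  esac
}
```

Compare the printed host with the keys under auths in the decoded Secret. A key without the port will not match an image that is pulled through registry.example.com:5000. For Docker Hub images, the key is commonly https://index.docker.io/v1/ rather than docker.io.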

Metrics and signals to monitor

Signal | Why it matters | Warning sign
Pod phase Pending with ImagePullBackOff | Direct indicator of pull failure | Any pod in this state longer than 5 minutes
kubelet_runtime_operations_errors_total{operation_type="pull_image"} | Kubelet is failing to pull images | Sustained increase on a node
kubelet_image_pull_duration_seconds | Registry auth or network latency is degrading | p99 latency trending upward
Pod startup latency | Image pull is blocking deployment velocity | p99 startup time > 30s during rollouts
Container restart count | A restarted container may need a fresh pull, which then fails on bad credentials | Restart count increasing alongside pull failures
Warning events with reason Failed | Kubernetes is surfacing the exact error | Spike in events with reason Failed and message containing Failed to pull image

Fixes

If the secret is missing or misnamed

Create a new kubernetes.io/dockerconfigjson Secret in the correct namespace. Use kubectl create secret docker-registry, or kubectl create secret generic with --type=kubernetes.io/dockerconfigjson and --from-file=.dockerconfigjson=<path>, or construct the JSON manually. Update the pod spec or ServiceAccount to reference the exact Secret name.
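A minimal sketch of the manual route, assuming a single registry and values that need no JSON escaping. The function names write_dockerconfig and create_pull_secret are hypothetical. The first writes a config.json whose auth field is the base64 encoding of user:password, and the second loads that file into a Secret of the right type.

```shell
# write_dockerconfig SERVER USER PASSWORD OUTFILE
# Writes a minimal Docker config.json with one "auths" entry.
# Assumes the values contain no quotes or backslashes (no JSON escaping is done).
write_dockerconfig() {
  server=$1
  user=$2
  password=$3
  outfile=$4
  auth=$(printf '%s:%s' "$user" "$password" | base64 | tr -d '\n')
  printf '{"auths":{"%s":{"username":"%s","password":"%s","auth":"%s"}}}\n' \
    "$server" "$user" "$password" "$auth" > "$outfile"
}

# create_pull_secret NAME NAMESPACE CONFIGFILE
# Creates the Secret from the file written above.
create_pull_secret() {
  kubectl create secret generic "$1" \
    --namespace "$2" \
    --type=kubernetes.io/dockerconfigjson \
    --from-file=.dockerconfigjson="$3"
}
```

The server value should match the image prefix exactly, including any port. Delete the temporary config file once the Secret exists, since it holds the credential in clear text.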

If the ServiceAccount is not propagating the secret

Patch the ServiceAccount to include the secret in its imagePullSecrets array. Existing pods do not pick up the change automatically. Delete the pods to recreate them, or wait for natural churn.
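One way to script this step is sketched below. The function names attach_pull_secret and restart_workload are hypothetical, and the restart helper assumes the pods belong to a Deployment. The patch uses --type merge, which sets the imagePullSecrets list to exactly the one entry given, so any entries already on the ServiceAccount are replaced.

```shell
# attach_pull_secret SERVICEACCOUNT NAMESPACE SECRET
# Sets the ServiceAccount's imagePullSecrets to the single named Secret.
attach_pull_secret() {
  kubectl patch serviceaccount "$1" --namespace "$2" \
    --type merge \
    --patch "{\"imagePullSecrets\": [{\"name\": \"$3\"}]}"
}

# restart_workload DEPLOYMENT NAMESPACE
# Recreates the Deployment's pods so they are admitted with the new list.
restart_workload() {
  kubectl rollout restart "deployment/$1" --namespace "$2"
}
```

For example, attach_pull_secret default <namespace> regcred followed by restart_workload <deployment> <namespace>. Checking the result with kubectl get serviceaccount default -n <namespace> -o yaml before restarting confirms the list is what you expect.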

If credentials expired

Rotate the password or token at the registry first. Then update the Secret. Pods reference the Secret by name, whether the reference was set in the pod spec or copied from the ServiceAccount at admission, so the kubelet picks up the updated Secret on a later pull attempt. If the image is already cached on the node and the pull policy is IfNotPresent, the kubelet does not re-pull and will not verify the new credential against the registry until the image is evicted or a new node is used. Plan rotations before expiry or force a pod reschedule to a clean node to validate.
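An in-place update can be sketched as re-rendering the Secret on the client and applying it over the existing object. The function name refresh_pull_secret is hypothetical. Note that passing the password as a command-line argument exposes it to the local process list, so treat this as an illustration rather than a hardened procedure.

```shell
# refresh_pull_secret NAME NAMESPACE SERVER USER PASSWORD
# Renders a docker-registry Secret without creating it, then applies the
# rendered YAML so an existing Secret of the same name is updated.
refresh_pull_secret() {
  kubectl create secret docker-registry "$1" \
    --namespace "$2" \
    --docker-server="$3" \
    --docker-username="$4" \
    --docker-password="$5" \
    --dry-run=client -o yaml |
    kubectl apply -f -
}
```

Because the Secret name does not change, no pod spec or ServiceAccount needs editing. The caveat about cached images still applies: a successful apply does not prove the new credential works until some node actually pulls with it.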

If you are running static pods

Move registry authentication to a kubelet credential provider plugin configured via the kubelet flags --image-credential-provider-bin-dir and --image-credential-provider-config. Static pods cannot read API server secrets.
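A sketch of the file that --image-credential-provider-config could point at is shown below, wrapped in a helper that writes it to disk. The helper name write_credential_provider_config is hypothetical, and the provider entry is only an example: it assumes the AWS ecr-credential-provider binary is present in the directory given by --image-credential-provider-bin-dir. Other registries need their own plugin and matchImages patterns, and older kubelets may need an earlier apiVersion.

```shell
# write_credential_provider_config OUTFILE
# Writes an example CredentialProviderConfig for the kubelet.
write_credential_provider_config() {
  cat > "$1" <<'EOF'
apiVersion: kubelet.config.k8s.io/v1
kind: CredentialProviderConfig
providers:
  # name must match the plugin binary's file name in the bin dir
  - name: ecr-credential-provider
    matchImages:
      - "*.dkr.ecr.*.amazonaws.com"
    defaultCacheDuration: "12h"
    apiVersion: credentialprovider.kubelet.k8s.io/v1
EOF
}
```

The kubelet only consults the plugin for images that match one of the matchImages patterns, so static pods pulling from any other registry still need another node-level credential source.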

If you cannot restart pods immediately during rotation

Create a second Secret with the new credentials. Patch the ServiceAccount to reference both secrets temporarily. Wait for natural pod churn to propagate the new credential, then remove the old secret from the ServiceAccount.
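The two-phase rotation above might be scripted with a helper that sets the ServiceAccount's list to whatever Secret names it is given. The function name set_pull_secrets and the Secret names regcred-old and regcred-new are hypothetical. The patch uses --type merge, so the list is replaced with exactly the names passed in.

```shell
# set_pull_secrets SERVICEACCOUNT NAMESPACE SECRET...
# Builds the imagePullSecrets list from the remaining arguments and patches it in.
set_pull_secrets() {
  sa=$1
  ns=$2
  shift 2
  list=""
  for name in "$@"; do
    # Join entries with ", " after the first one.
    list="${list:+$list, }{\"name\": \"$name\"}"
  done
  kubectl patch serviceaccount "$sa" --namespace "$ns" \
    --type merge \
    --patch "{\"imagePullSecrets\": [$list]}"
}
```

Phase one would be set_pull_secrets default <namespace> regcred-old regcred-new, and phase two, once the old pods are gone, set_pull_secrets default <namespace> regcred-new. Pods admitted before phase one still reference only the old Secret by name, so keep the old credential valid at the registry until those pods have been replaced.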

Prevention

Attach imagePullSecrets to ServiceAccounts, not individual pods. This centralizes credential management and reduces drift. Use a namespace-scoped Secret per registry. Do not use the discouraged kubernetes.io/dockercfg type.

Prefer kubelet credential provider plugins for cloud registries instead of long-lived secrets where possible. This removes the Secret from the API server entirely and delegates authentication to the node.

Implement a rotation runbook that creates new secrets before invalidating old ones. Avoid in-place Secret updates that rely on kubelet re-reading the same object during runtime, because cached images may mask credential expiry until a new node is provisioned.

Monitor kubelet_runtime_operations_errors_total and pod ImagePullBackOff rates as leading indicators. Set alerts on sustained image pull failures rather than single pod restarts.

How Netdata helps

  • Correlates kubelet image pull error rates with pod restart counts to distinguish registry auth failures from application crashes.
  • Surfaces pod startup latency spikes during secret rotation events.
  • Tracks pods in ImagePullBackOff per node alongside CRI operation latency to pinpoint whether the bottleneck is registry auth, network, or disk I/O.
  • Alerts on sustained increases in kubelet_runtime_operations_errors_total for pull operations before workloads enter CrashLoopBackOff.