Kubernetes service not reachable: kube-proxy, endpoints, and DNS
A Service fails when the chain between the client and backend breaks. That chain depends on EndpointSlices to list healthy pods, kube-proxy to program kernel rules, and cluster DNS to resolve names to ClusterIPs. This guide covers the gap between healthy backend pods and an unreachable Service, focusing on kube-proxy data-plane programming, endpoint state, and DNS dependencies. It does not cover application-level bugs inside the pod.
What this means
Reachability is a control-loop problem. kube-proxy watches Services and EndpointSlices, then programs iptables, IPVS, or nftables rules to DNAT traffic to healthy endpoints. These rules persist in the kernel if kube-proxy crashes, but updates stop until it resumes. DNS resolution targets the CoreDNS ClusterIP, so a kube-proxy failure often appears as a DNS outage before a direct Service timeout.
Failures fall into four areas:
- Empty EndpointSlices (no ready addresses).
- Stale or missing kube-proxy rules (sync lag, API disconnect, lock contention).
- Conntrack table exhaustion (silent drops).
- DNS misconfiguration or CoreDNS unavailability.
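Before inspecting kernel state, confirm which data-plane mode kube-proxy runs in, since that decides whether to read iptables, IPVS, or nftables rules. One way to check, assuming a kubeadm-style cluster that stores the config in the kube-proxy ConfigMap (managed distributions may differ):
# Read the configured proxy mode; an empty value means the platform default
# (iptables on Linux)
kubectl -n kube-system get configmap kube-proxy -o yaml | grep -E '^\s+mode:'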
Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| No ready endpoints | kubectl get endpoints <svc> shows <none>; connection refused or timeout. | Pod labels, readiness probe state, rolling update progress. |
| kube-proxy sync lag or stale rules | New Services unreachable from some nodes; existing connections hit terminated pods. | kubeproxy_sync_proxy_rules_duration_seconds, last sync timestamp, kernel rule state. |
| Conntrack table exhaustion | Intermittent timeouts across unrelated Services on one node; DNS fails first; existing connections still work. | nf_conntrack_count vs nf_conntrack_max and conntrack -S drop counters. |
| DNS resolution failure | nslookup returns NXDOMAIN or times out. | CoreDNS pod health, cluster DNS Service endpoints, pod /etc/resolv.conf. |
| NodePort conflict or host firewall | Service works on some nodes but not others; connection refused on specific node IPs. | ss -tlnp for port conflicts; host firewall rules for NodePort range. |
| externalTrafficPolicy: Local with no local endpoints | External LB health checks fail on some nodes; traffic dropped on nodes without a backend. | Health check node port (:healthCheckNodePort/healthz) and local pod distribution. |
Quick checks
Run these read-only checks to narrow the blast radius.
# Verify the Service has endpoints
kubectl get endpoints <service-name> -n <namespace>
kubectl get endpointslices -n <namespace> | grep <service-name>
# Check if backend pods are ready and labels match the selector
kubectl get pods -n <namespace> -l <service-selector-key>=<value>
# Test DNS resolution from a client pod
kubectl run -it --rm debug --image=busybox:1.36 --restart=Never -- nslookup <service-name>.<namespace>.svc.cluster.local
# Check kube-proxy health endpoints on the node (/livez requires Kubernetes v1.28+)
curl -s -o /dev/null -w "%{http_code}" http://127.0.0.1:10256/healthz
curl -s -o /dev/null -w "%{http_code}" http://127.0.0.1:10256/livez
# Check how recently kube-proxy completed a successful sync
curl -s http://127.0.0.1:10249/metrics | grep kubeproxy_sync_proxy_rules_last_timestamp_seconds
# Inspect programmed iptables rules for a Service (iptables mode)
sudo iptables -t nat -L KUBE-SERVICES -n | grep <service-cluster-ip>
sudo iptables -t nat -L -n | grep -c "KUBE-SEP-"
# Inspect IPVS state (IPVS mode)
sudo ipvsadm -Ln | grep <service-cluster-ip>
# Inspect nftables state (nftables mode)
sudo nft list table ip kube-proxy   # IPv6 rules live in table ip6 kube-proxy
# Check conntrack utilization
awk 'NR==FNR{c=$1;next} {printf "%.0f%%\n", c*100/$1}' /proc/sys/net/netfilter/nf_conntrack_count /proc/sys/net/netfilter/nf_conntrack_max
# Check for conntrack drops
sudo conntrack -S | grep drop
How to diagnose it
Follow this sequence to isolate which layer is dropping traffic.
- Confirm the Service has endpoints. Run kubectl get endpoints <svc> and kubectl get endpointslices. If the addresses list is empty, the problem is upstream of kube-proxy: check that the selector matches pod labels and that the pods are ready. If endpoints exist, continue.
- Test DNS from the client namespace. An unqualified service name resolves through the pod's search path, so it matches only within the pod's own namespace. Run nslookup <service>.<namespace>.svc.cluster.local from the client pod. If this fails, check whether CoreDNS pods are running. Because CoreDNS is usually exposed as a ClusterIP Service, a kube-proxy failure can break DNS before it breaks your application Service. Inspect the client pod's /etc/resolv.conf to ensure the nameserver is the cluster DNS IP and not a loopback stub like 127.0.0.53 from systemd-resolved.
- Check kube-proxy health and sync recency. On the node where the client or backend runs, query http://127.0.0.1:10256/healthz (and /livez on Kubernetes v1.28+). A 503 on healthz means kube-proxy has not completed a recent successful sync. Pull the metrics from :10249/metrics and compare kubeproxy_sync_proxy_rules_last_timestamp_seconds against the current time. If the gap exceeds twice the sync period, the node is operating on stale rules.
- Compare kernel rules to EndpointSlice state. For iptables mode, list the KUBE-SVC-* and KUBE-SEP-* chains and verify that the endpoint IPs match the addresses in the EndpointSlice; a scripted version of this check follows the list. For IPVS mode, use ipvsadm -Ln to confirm virtual servers and real servers align with the Service and EndpointSlice. For nftables mode, use nft list table ip kube-proxy. If the rules are missing or point to old pod IPs, kube-proxy is either failing to sync or lagging behind.
- Check conntrack saturation on the node. Read nf_conntrack_count and nf_conntrack_max. If utilization is above 90%, new connections are at risk of being dropped silently. Run conntrack -S and look for a non-zero drop counter. If drops are incrementing, conntrack exhaustion is the immediate cause. Use conntrack -L to see which protocols and states dominate the table.
- Verify external traffic path for NodePort and LoadBalancer Services. Check ss -tlnp on the node to see whether another process holds the NodePort; note that in iptables and IPVS modes kube-proxy redirects NodePort traffic with kernel rules rather than a listening socket, so an absent listener is not by itself a failure. For Services with externalTrafficPolicy: Local, query the health check node port (curl http://<node-ip>:<healthCheckNodePort>/healthz). A 503 here means the node has no local endpoints, which is expected behavior for that policy, but external load balancers will stop sending traffic to that node.
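The spot check in step 4 can be scripted. A minimal sketch for iptables mode, run where both kubectl and iptables are available; <service-name> and <namespace> are placeholders, and it checks all EndpointSlice addresses, not just ready ones:
SVC=<service-name>; NS=<namespace>
for ip in $(kubectl get endpointslices -n "$NS" \
    -l kubernetes.io/service-name="$SVC" \
    -o jsonpath='{range .items[*].endpoints[*]}{.addresses[0]}{"\n"}{end}'); do
  # Every endpoint address should appear as a DNAT target in a KUBE-SEP chain
  if sudo iptables-save -t nat | grep -q "to-destination $ip:"; then
    echo "OK      $ip is programmed"
  else
    echo "MISSING $ip has no kernel rule"
  fi
done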
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
| nf_conntrack_count / nf_conntrack_max | Shared kernel resource; when full, new connections drop silently. | Sustained utilization above 70%. |
| kubeproxy_sync_proxy_rules_duration_seconds | Measures how long it takes to program rules. If it exceeds the sync period, kube-proxy falls behind. | P99 duration greater than 5 seconds in iptables mode, or approaching the sync period. |
| kubeproxy_sync_proxy_rules_last_timestamp_seconds | Indicates the last successful sync. A stale timestamp means rules are frozen even if the process is running. | Timestamp older than 2x the sync period. |
| rest_client_requests_total (error codes) | Reveals API server connectivity issues that block watch and sync loops. | Sustained rate of 5xx, 403, or 429 responses. |
| workqueue_depth for service and endpointslice | A growing queue means changes arrive faster than kube-proxy can apply them. | Depth consistently above 0 for more than 2 minutes. |
| kube-proxy process restarts / OOM kills | OOM kills cause transient rule absence and stale state on restart. | Any OOMKilled event or more than 2 restarts in 10 minutes. |
| Endpoint to rule consistency (spot check) | Catches silent sync failures where healthz passes but rules are wrong. | Discrepancy between kubectl get endpointslices and kernel rules. |
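To evaluate the staleness signal directly on a node, compare the metric to the current time. A sketch; it assumes the metrics endpoint on :10249 is reachable locally:
# Print seconds since kube-proxy last completed a successful rule sync;
# awk handles the scientific notation Prometheus metrics use
curl -s http://127.0.0.1:10249/metrics \
  | awk -v now="$(date +%s)" '/^kubeproxy_sync_proxy_rules_last_timestamp_seconds/ {printf "%.0f seconds since last sync\n", now - $2}'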
Fixes
Apply fixes that match the layer you isolated in diagnosis.
If the cause is missing or unready endpoints
Correct the Service selector to match pod labels. Fix readiness probes so they reflect actual serving capacity. If a rolling update left zero ready replicas, pause or roll back the deployment. A Service with zero endpoints causes kube-proxy to install a REJECT rule. This is correct behavior, not a kube-proxy bug.
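To rule out a selector mismatch quickly, compare the Service's selector with live pod labels. A sketch that assumes jq is installed; <svc> and <ns> are placeholders:
# Convert the selector map into a label query, then list matching pods
sel=$(kubectl get svc <svc> -n <ns> -o json \
  | jq -r '.spec.selector | to_entries | map("\(.key)=\(.value)") | join(",")')
echo "Service selector: $sel"
kubectl get pods -n <ns> -l "$sel" --show-labels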
If the cause is kube-proxy sync lag or stale rules
If sync duration is high due to iptables rule bloat in a large cluster, temporarily increase the sync period to allow full syncs to complete. Plan a migration to IPVS or nftables mode, which scale better than iptables. If the sync is stuck because of API server disconnect, check network paths to the control plane and verify RBAC for the system:node-proxier ClusterRole. Restarting the kube-proxy pod on the affected node forces a full re-sync, but treat this as a temporary recovery step, not a root cause fix.
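If you do need the recovery restart, target only the affected node. A sketch, assuming the standard k8s-app=kube-proxy label that kubeadm applies to the DaemonSet:
# Warning: state-changing command. The DaemonSet recreates the pod,
# which performs a full rule re-sync on startup.
kubectl -n kube-system delete pod -l k8s-app=kube-proxy \
  --field-selector spec.nodeName=<node-name>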
If the cause is conntrack exhaustion
Immediate relief: raise nf_conntrack_max. This takes effect instantly but consumes kernel memory (roughly 300 bytes per entry).
# Warning: state-changing command
sudo sysctl -w net.netfilter.nf_conntrack_max=262144
Then investigate the source. High TIME_WAIT counts suggest short-lived HTTP connections without pooling. High UDP counts suggest aggressive DNS or metrics traffic. Tune protocol-specific timeouts if your workload allows, and fix application connection leaks.
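To see what dominates the table, break entries down by protocol and TCP state. A rough sketch; conntrack -L output format varies slightly across versions:
# Entries by protocol
sudo conntrack -L 2>/dev/null | awk '{print $1}' | sort | uniq -c | sort -rn
# TCP entries parked in TIME_WAIT, a signal of unpooled short-lived connections
sudo conntrack -L -p tcp 2>/dev/null | grep -c TIME_WAIT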
If the cause is DNS resolution
Ensure CoreDNS pods are running and that their Service has endpoints. Verify that client pods use the cluster DNS IP in /etc/resolv.conf. If the node uses systemd-resolved, ensure kubelet is configured with --resolv-conf=/run/systemd/resolve/resolv.conf to avoid a stub resolver loop. If the upstream resolv.conf passed to pods contains more than 3 nameservers, reduce the list or use a local dnsmasq sidecar, because glibc limits nameservers to 3.
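Concrete commands for those checks; kube-dns is the conventional Service and label name CoreDNS registers under, so adjust if your distribution renames it:
# CoreDNS pod health and the DNS Service's endpoints
kubectl -n kube-system get pods -l k8s-app=kube-dns
kubectl -n kube-system get endpoints kube-dns
# The resolver the client pod actually uses
kubectl exec -n <namespace> <client-pod> -- cat /etc/resolv.conf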
If the cause is NodePort conflict or external traffic policy
Identify the conflicting process with sudo ss -tlnp | grep <node-port> and either stop it or reallocate the NodePort. For externalTrafficPolicy: Local, either change the policy to Cluster (which permits cross-node forwarding but obscures source IP) or ensure the workload is scheduled so that every node receiving traffic has at least one local ready endpoint.
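To probe the Local-policy health check path end to end (healthCheckNodePort is only allocated for LoadBalancer Services with externalTrafficPolicy: Local):
# Read the allocated health check port, then probe a specific node
hcp=$(kubectl get svc <svc> -n <ns> -o jsonpath='{.spec.healthCheckNodePort}')
curl -s -o /dev/null -w "%{http_code}\n" "http://<node-ip>:${hcp}/healthz"
# 200 means the node has a local ready endpoint; 503 means the LB should skip it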
Prevention
Monitor conntrack utilization per node and alert at 70%, not at 100%. Track the trend of kubeproxy_sync_proxy_rules_duration_seconds against your sync period. When P99 duration crosses 50% of the sync period, start planning a mode migration or Service consolidation. Size kube-proxy memory limits with at least 2x headroom above steady-state RSS to accommodate re-list spikes after watch disconnects. Require readiness probes that are precise, and validate them before deployments. Restrict NodePort allocations to an approved range and audit ExternalIP usage via RBAC. Periodically spot-check that kernel rules match EndpointSlice state, especially after cluster upgrades or CNI changes.
How Netdata helps
Use per-node kernel and container metrics to narrow the scope:
- Correlate nf_conntrack_count with connection error rates to confirm conntrack exhaustion before drops appear.
- Track kube-proxy sync duration and API latency to catch sync lag or watch disconnect.
- Alert on conntrack saturation and kube-proxy process restarts alongside Service timeouts.
- Check CPU softirq time and network drops to distinguish kube-proxy overhead from physical network issues.
Related guides
- Kubernetes API server slow or unresponsive: causes and fixes
- Kubernetes kubelet not responding: PLEG, runtime, and certificate issues
- Kubernetes monitoring checklist: the signals every production cluster needs
- Kubernetes node NotReady: kubelet, runtime, and network diagnosis
- Kubernetes pod stuck ContainerCreating: volume, network, and image issues
- Kubernetes pod CrashLoopBackOff: causes, diagnosis, and fixes
- Kubernetes pod ImagePullBackOff: registry, auth, and network diagnosis
- Kubernetes pod OOMKilled: cgroup limits, evictions, and fixes
- Kubernetes pod stuck Pending: scheduling failures explained