Kafka LEADER_NOT_AVAILABLE: causes during elections, restarts, and topic creation
LEADER_NOT_AVAILABLE means a client asked a broker to produce or fetch from a partition that has no assigned leader. In healthy clusters this is brief and appears during rolling restarts, controller elections, or topic creation. Persistent errors correlate with OfflinePartitionsCount > 0 and indicate the data plane is broken for those partitions. Distinguish this from NOT_LEADER_FOR_PARTITION (named NOT_LEADER_OR_FOLLOWER since Kafka 2.6), which means a leader exists but the client contacted the wrong broker and needs a metadata refresh.
What this means
A partition serves reads and writes only through its leader. The active controller assigns leadership. When a client asks a broker for metadata on a partition that has no leader, typically right before a produce or fetch, the broker reports LEADER_NOT_AVAILABLE for that partition and the client retries after refreshing metadata.
Normal operation:
- A broker is restarting and its partitions are undergoing leader election.
- The controller is moving leadership during a preferred replica election.
- A new topic was created but leader assignments have not yet propagated.
Abnormal operation:
- Every replica is offline or out of sync and unclean.leader.election.enable=false.
- The controller is absent, crashed, or its event queue is backed up and elections stall.
- A broker is network-partitioned but not fully down, preventing the controller from cleaning up leadership.
flowchart TD
A[Client sees LEADER_NOT_AVAILABLE] --> B{Transient?}
B -->|Yes: restart, election, new topic| C[Wait and let client retry]
B -->|No: persists >60s| D{ActiveControllerCount == 1?}
D -->|No| E[Fix controller or quorum]
D -->|Yes| F{OfflinePartitionsCount > 0?}
F -->|Yes| G[Find leaderless partitions and replica state]
F -->|No| H[Treat as NOT_LEADER_FOR_PARTITION stale metadata]
G --> I{ISR is empty?}
I -->|Yes| J[Recover replicas or accept unclean election]
I -->|No| K[Check controller queue and election latency]
Common causes
| Cause | Symptoms | First check |
|---|---|---|
| Transient leader election during rolling restart or broker failure | LeaderElectionRateAndTimeMs spikes; OfflinePartitionsCount and UnderReplicatedPartitions briefly rise, then fall | ActiveControllerCount equals 1 and election p99 is under 1 second |
| New topic metadata not yet propagated | Errors target only the new topic; other topics are healthy | kafka-topics.sh --describe for the topic and watch leaders appear |
| No ISR available for the partition | OfflinePartitionsCount stays above 0; UncleanLeaderElectionsPerSec is 0 | kafka-topics.sh --describe --unavailable-partitions and broker liveness |
| Controller loss or event queue backup | ActiveControllerCount is not 1, or ControllerEventQueueSize grows without draining | Controller broker logs, ZK session state, or KRaft quorum health |
| Network-partitioned or flapping broker | Broker process is up but unreachable; ISR shrinks on cluster leaders; follower fetch latency rises | Network reachability between brokers, dmesg, interface counters |
Quick checks
# Leaderless partitions
kafka-topics.sh --bootstrap-server localhost:9092 --describe --unavailable-partitions
# Under-replicated partitions
kafka-topics.sh --bootstrap-server localhost:9092 --describe --under-replicated-partitions
# Active controller count (query every broker; the cluster-wide sum must equal 1)
echo "get -b kafka.controller:type=KafkaController,name=ActiveControllerCount Value" | java -jar jmxterm.jar -l localhost:9999 -n
# Controller event queue depth
echo "get -b kafka.controller:type=ControllerEventManager,name=EventQueueSize Value" | java -jar jmxterm.jar -l localhost:9999 -n
# Leader election rate and latency
echo "get -b kafka.controller:type=ControllerStats,name=LeaderElectionRateAndTimeMs OneMinuteRate" | java -jar jmxterm.jar -l localhost:9999 -n
echo "get -b kafka.controller:type=ControllerStats,name=LeaderElectionRateAndTimeMs 99thPercentile" | java -jar jmxterm.jar -l localhost:9999 -n
# ISR shrink/expand velocity
echo "get -b kafka.server:type=ReplicaManager,name=IsrShrinksPerSec OneMinuteRate" | java -jar jmxterm.jar -l localhost:9999 -n
echo "get -b kafka.server:type=ReplicaManager,name=IsrExpandsPerSec OneMinuteRate" | java -jar jmxterm.jar -l localhost:9999 -n
# KRaft quorum state
kafka-metadata-quorum.sh --bootstrap-server localhost:9092 describe --status
# Broker process liveness
systemctl status kafka
ss -tnlp | grep 9092
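The leaderless-partition listing becomes more useful once it is cross-referenced with the brokers that are actually up. The sketch below is one possible way to do that, not an official tool. It assumes the tab-separated "Topic: / Partition: / Leader: / Isr:" line layout that kafka-topics.sh --describe prints, assumes a missing leader is shown as none or -1, and expects the operator to supply the list of live broker IDs. The helper name classify_partitions is hypothetical.

```shell
# classify_partitions <live_broker_ids>
# Reads `kafka-topics.sh --describe` output on stdin and prints one line per
# leaderless partition: <topic> <partition> <state>
#   NO_LIVE_ISR       no ISR member is on a live broker; a replica must recover
#   ELECTION_STALLED  a live ISR member exists; look at the controller instead
# Assumptions: fields are tab-separated and labelled as shown below, and a
# missing leader is printed as "none" or "-1". Verify on your Kafka version.
classify_partitions() {
  awk -F'\t' -v live="$1" '
    BEGIN { n = split(live, ids, ","); for (i = 1; i <= n; i++) up[ids[i]] = 1 }
    {
      topic = ""; part = ""; leader = ""; isr = ""; has_part = 0
      for (i = 1; i <= NF; i++) {
        f = $i
        sub(/^ +/, "", f)
        if (f ~ /^Topic: /)          { topic = substr(f, 8) }
        else if (f ~ /^Partition: /) { part = substr(f, 12); has_part = 1 }
        else if (f ~ /^Leader: /)    { leader = substr(f, 9) }
        else if (f ~ /^Isr:/)        { isr = substr(f, 5); gsub(/ /, "", isr) }
      }
      if (!has_part) next                           # topic summary line
      if (leader != "none" && leader != "-1") next  # partition has a leader
      state = "NO_LIVE_ISR"
      m = split(isr, members, ",")
      for (j = 1; j <= m; j++) if (members[j] in up) state = "ELECTION_STALLED"
      print topic, part, state
    }'
}
```

A typical invocation, with brokers 1 and 3 known to be up, would pipe the describe output into the helper: kafka-topics.sh --bootstrap-server localhost:9092 --describe --unavailable-partitions | classify_partitions 1,3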
How to diagnose it
1. Separate LEADER_NOT_AVAILABLE from NOT_LEADER_FOR_PARTITION. LEADER_NOT_AVAILABLE means no leader exists. NOT_LEADER_FOR_PARTITION means the broker contacted is not the leader and the client metadata is stale. If only some clients complain and a metadata refresh fixes it, you are dealing with stale metadata, not a leaderless partition.
2. Check if the error is transient. Examine the last few minutes of LeaderElectionRateAndTimeMs, OfflinePartitionsCount, and UnderReplicatedPartitions. If these spike and recover within 30-60 seconds of a rolling restart or topic creation, this is expected.
3. Verify controller health. Sum ActiveControllerCount across all brokers. It must equal exactly 1. If it does not, or if ControllerEventQueueSize is consistently above 100 and growing, the controller is the bottleneck.
4. Identify offline partitions. Run kafka-topics.sh --describe --unavailable-partitions. This lists the topic, partition, and replica list. Cross-reference with broker liveness.
5. Check replica state. In the describe output, look at the ISR. If the ISR is empty and all replicas are on down brokers, the partition stays offline until a replica recovers or an unclean election is allowed.
6. Investigate broker liveness. Check process state, port reachability, recent restarts, disk I/O latency, and network partitions. A broker that is up but unreachable produces the same symptoms as a dead broker.
7. Check replication health on surviving leaders. Elevated IsrShrinksPerSec, UnderReplicatedPartitions, and follower fetch latency indicate that followers are being removed from the ISR, which can push more partitions below min.insync.replicas or to zero ISR.
8. Check metadata store health. In KRaft mode, verify the quorum has a leader and acceptable commit latency. In ZooKeeper mode, check ZooKeeperRequestLatencyMs p99 and ZooKeeperExpiresPerSec for session expirations that can eject the controller.
9. Read broker and controller logs. Search for ERROR lines around leader election, controller events, and network timeouts. Logs often reveal the first failure before metrics show the full impact.
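The order of the checks above can be condensed into a small decision helper, which is handy in a runbook or an alert handler. This is only a sketch: the function name triage and its four arguments are hypothetical, the values are expected to come from your own monitoring, and the 60-second cutoff is the persistence threshold used elsewhere in this guide, not a Kafka constant.

```shell
# triage <persisted_seconds> <active_controller_sum> <offline_partitions> <live_isr_members>
# Prints the first branch of the diagnosis that applies.
# All four inputs are assumed to be non-negative integers supplied by the caller.
triage() {
  persisted=$1; controllers=$2; offline=$3; live_isr=$4
  if [ "$persisted" -le 60 ]; then
    echo "transient: wait and let clients retry"
  elif [ "$controllers" -ne 1 ]; then
    echo "controller: fix controller or quorum"
  elif [ "$offline" -eq 0 ]; then
    echo "stale-metadata: treat as NOT_LEADER_FOR_PARTITION"
  elif [ "$live_isr" -eq 0 ]; then
    echo "no-isr: recover replicas or accept unclean election"
  else
    echo "stalled-election: check controller queue and election latency"
  fi
}
```

For example, triage 120 1 2 0 describes an error that has lasted two minutes with a healthy controller, two offline partitions, and no live in-sync replica, so it lands on the replica recovery branch.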
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
| OfflinePartitionsCount | Direct measure of leaderless partitions | Nonzero for more than 60 seconds |
| ActiveControllerCount | Exactly one controller must exist to assign leaders | Cluster-wide sum is not 1 |
| LeaderElectionRateAndTimeMs | Spikes during failures; high latency means elections stall | Rate spikes outside maintenance, or p99 above 1 second |
| ControllerEventQueueSize | Pending metadata operations; a backed-up queue delays elections | Consistently above 100, or growing above 1000 |
| UnderReplicatedPartitions | Replicas are falling behind and may leave the ISR | Nonzero and growing outside maintenance |
| IsrShrinksPerSec | Velocity of replicas leaving the ISR | Sustained above 0 for more than 5 minutes |
| FailedProduceRequestsPerSec | Direct producer-visible impact | Sustained nonzero rate |
| KRaft quorum state | Metadata-plane health in KRaft mode | current-leader = -1, or high commit latency |
| ZooKeeperRequestLatencyMs / ZooKeeperExpiresPerSec | ZK health in ZK mode; high latency or expiry can kill the controller | p99 above 1 second, or any session expiry |
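ActiveControllerCount is the one signal in this table that only makes sense as a cluster-wide sum, so a per-broker reading of 0 or 1 says little on its own. The sketch below adds up one JMX reply per broker. It assumes each reply contains a line shaped like "Value = 1;", which is how jmxterm is expected to print a simple attribute; check the output of your jmxterm version before relying on it. The helper name check_controller_count is hypothetical.

```shell
# check_controller_count
# Reads one jmxterm reply per broker on stdin and checks that the
# ActiveControllerCount values sum to exactly 1.
# Assumption: each reply has a line of the form "Value = <n>;".
# Exit status: 0 = exactly one controller, 1 = wrong sum, 2 = no samples.
check_controller_count() {
  awk '
    /^Value = / { v = $3; sub(/;$/, "", v); sum += v; seen++ }
    END {
      if (seen == 0)     { print "UNKNOWN no samples"; exit 2 }
      else if (sum == 1) { print "OK sum=1" }
      else               { print "ALERT sum=" sum; exit 1 }
    }'
}
```

One way to feed it is to loop over the JMX endpoints of all brokers, run the ActiveControllerCount query against each, and pipe the combined output into check_controller_count.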
Fixes
Transient errors during elections, restarts, or topic creation
Do not restart brokers or recreate topics while the controller is electing leaders. Let the controller finish and let clients retry with metadata refreshes. If producers are especially sensitive, confirm they are configured to retry and refresh metadata on LEADER_NOT_AVAILABLE. For new topics, wait until kafka-topics.sh --describe shows leaders assigned before directing traffic to the topic.
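Waiting for leaders on a new topic can be scripted as a polling loop in front of the first producer start. The following is a minimal sketch under a few assumptions: the describe output marks a missing leader as none or -1, the broker is reachable at localhost:9092 unless BOOTSTRAP is set, and DESCRIBE_CMD and POLL_INTERVAL are hypothetical hooks added here so the loop can be exercised without a cluster.

```shell
# describe_topic <topic>: thin wrapper so the data source can be swapped out.
describe_topic() {
  if [ -n "${DESCRIBE_CMD:-}" ]; then
    $DESCRIBE_CMD "$1"
  else
    kafka-topics.sh --bootstrap-server "${BOOTSTRAP:-localhost:9092}" --describe --topic "$1"
  fi
}

# wait_for_leaders <topic> [timeout_seconds]
# Returns 0 once partition lines are present and none reports a missing leader,
# 1 if that has not happened when the timeout is exceeded.
wait_for_leaders() {
  topic=$1; timeout=${2:-60}; waited=0
  while [ "$waited" -le "$timeout" ]; do
    out=$(describe_topic "$topic" 2>/dev/null) || out=""
    if printf '%s\n' "$out" | grep -q 'Partition: ' &&
       ! printf '%s\n' "$out" | grep -Eq 'Leader: (none|-1)'; then
      echo "ready after ${waited}s"
      return 0
    fi
    sleep "${POLL_INTERVAL:-2}"
    waited=$((waited + ${POLL_INTERVAL:-2}))
  done
  echo "timed out after ${timeout}s" >&2
  return 1
}
```

A deployment script could call wait_for_leaders orders 30 and start producers only when it returns 0.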
Persistent offline partitions with no ISR
If all replicas are down and unclean.leader.election.enable=false, the partition stays offline until a replica returns. Priorities:
- Restore the failed brokers or fix the network partition.
- If a broker is permanently lost, use kafka-reassign-partitions.sh to move replicas to healthy brokers. This triggers large data movement.
- Warning: As a last resort, if data loss is acceptable, enable unclean leader election temporarily on the topic, then disable it immediately after recovery. This can silently truncate acknowledged writes.
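Because the last-resort option is easy to leave switched on by accident, it helps to script both halves of the toggle together. The sketch below builds the kafka-configs.sh calls for a topic-level override and prints them instead of running them unless DRY_RUN=0 is set. The helper names run and unclean_toggle, and the BOOTSTRAP and DRY_RUN variables, are hypothetical. How quickly the cluster acts on the override can differ between ZooKeeper and KRaft deployments and between versions, so confirm the result with --describe rather than assuming it.

```shell
# run: print the command when DRY_RUN is 1 (the default), execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

# unclean_toggle <topic> <true|false>
# true  -> add the topic-level override unclean.leader.election.enable=true
# false -> delete the override so the topic falls back to the cluster default
unclean_toggle() {
  topic=$1; enable=$2
  set -- kafka-configs.sh --bootstrap-server "${BOOTSTRAP:-localhost:9092}" \
         --alter --entity-type topics --entity-name "$topic"
  if [ "$enable" = "true" ]; then
    run "$@" --add-config unclean.leader.election.enable=true
  else
    run "$@" --delete-config unclean.leader.election.enable
  fi
}
```

Reviewing the printed commands first, then repeating the call with DRY_RUN=0, keeps the data-loss decision explicit; the matching unclean_toggle <topic> false call belongs in the same change ticket.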
Controller loss or queue backup
If ActiveControllerCount is not 1:
- Identify why the controller was lost. Common causes are JVM OOM, long GC pauses, ZK session expiry, or KRaft quorum partition.
- Do not restart additional brokers; that generates more controller events.
- In KRaft mode, check voter connectivity and quorum logs. In ZK mode, check ZK latency and whether the ensemble has quorum.
- If the controller queue is large but a controller exists, monitor drain rate. If it is draining, wait. If it is growing, reduce load by stopping admin operations and reassignment jobs.
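Whether the queue is draining or growing is a comparison between samples, not a single reading. As a rough sketch, the helper below takes successive EventQueueSize samples, one integer per line with the oldest first, and compares the first with the last. The name queue_trend is hypothetical, and how you collect the samples (JMX polling, a metrics export) is left to your setup.

```shell
# queue_trend
# Reads queue-size samples on stdin (one integer per line, oldest first) and
# prints draining, growing, flat, or unknown (fewer than two samples).
queue_trend() {
  awk '
    NR == 1 { first = $1 + 0 }
            { last = $1 + 0 }
    END {
      if (NR < 2)            print "unknown"
      else if (last < first) print "draining"
      else if (last > first) print "growing"
      else                   print "flat"
    }'
}
```

Comparing only the endpoints ignores spikes in between, which is acceptable for a wait-or-act decision but not a substitute for a graph.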
Network-partitioned or flapping broker
A broker that is alive but partitioned can hold leadership without acknowledging followers, or prevent the controller from cleaning up its state. If the network issue cannot be resolved quickly, use controlled shutdown to remove the broker and let the cluster elect clean leaders on reachable replicas.
Prevention
- Monitor OfflinePartitionsCount and ActiveControllerCount with paging thresholds. Do not rely on UnderReplicatedPartitions alone; under-replication is normal during restarts, but offline partitions are not.
- Set min.insync.replicas=2 for replication.factor=3 topics. For acks=all producers, this prevents the write path from degrading to a single replica, reducing the chance of a leaderless partition after one broker failure.
- Leave unclean.leader.election.enable=false unless you explicitly accept data loss. The default has been false since Kafka 0.11.0.0.
- Size controller nodes for your partition count. Give them dedicated ZK or KRaft resources, and watch ControllerEventQueueSize during normal operations to detect creeping overload.
- Use rack-aware replication. A rack failure should shrink the ISR but not take all replicas of a partition offline.
- Run game-day rolling restarts and broker failures. Measure how long leader elections take, how high OfflinePartitionsCount spikes, and how long ISR recovery takes. Use that to set alert thresholds and maintenance windows.
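Game-day results are easier to compare when the offline window is computed the same way every time. The sketch below assumes you have recorded samples as "<epoch_seconds> <OfflinePartitionsCount>" lines during the exercise; that input format and the helper name offline_window are hypothetical. The reported duration is the time between the first and last nonzero sample, so it is only as precise as the sampling interval.

```shell
# offline_window
# Reads "<epoch_seconds> <OfflinePartitionsCount>" samples on stdin and prints
# the peak count and the seconds between the first and last nonzero sample.
offline_window() {
  awk '
    $2 > 0 {
      if (!seen) { first_ts = $1; seen = 1 }
      last_ts = $1
      if ($2 + 0 > peak) peak = $2 + 0
    }
    END {
      if (!seen) print "peak=0 duration=0s"
      else       printf "peak=%d duration=%ds\n", peak, last_ts - first_ts
    }'
}
```

Running it over the samples from each restart gives a pair of numbers to track across exercises and to compare against the 60-second paging threshold.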
How Netdata helps
- Correlates OfflinePartitionsCount, ActiveControllerCount, UnderReplicatedPartitions, and ControllerEventQueueSize in one view to separate transient elections from persistent leaderless partitions.
- Surfaces Kafka request latency breakdowns (RequestQueueTimeMs, LocalTimeMs, RemoteTimeMs) to show whether the bottleneck is replication, disk I/O, or thread saturation.
- Tracks KRaft quorum health and ZooKeeper latency alongside broker metrics, making metadata-plane failures visible without switching tools.
- Alerts on disk I/O latency, page cache pressure, and network retransmits that often precede ISR shrinks and leaderless partitions.
- Provides per-broker LeaderCount and PartitionCount views to catch leadership imbalance that can overload the broker most likely to lose leadership first.
Related guides
- How Kafka actually works in production: a mental model for operators
- Kafka ISR shrinking: IsrShrinksPerSec, flapping, and the cascade to offline
- Kafka monitoring checklist: the signals every production cluster needs
- Kafka monitoring maturity model: from survival to expert
- Kafka NotEnoughReplicasException: acks=all writes rejected below min.insync.replicas
- Kafka replica MaxLag growing: slow followers and replica fetcher health
- Kafka UnderMinIsrPartitionCount: confirming the write path is blocked
- Kafka UnderReplicatedPartitions > 0: the most important metric and how to clear it







