Kafka KRaft quorum has no leader: current-leader = -1 and frozen metadata

Topic creation hangs. Partition reassignments stall. Broker logs show metadata operations timing out. On controller nodes, the current-leader attribute of the kafka.server:type=raft-metrics MBean reads -1, and the quorum state is frozen. Existing producers and consumers continue to write and read, but the control plane is stuck.

What this means

The quorum leader acts as the active controller. When current-leader is -1 on a controller, that node has not discovered a leader. When every controller reports -1, no node in the quorum can commit new entries to the metadata log. Topic creation, deletion, configuration updates, reassignments, and ISR changes are blocked. Leader elections for partitions that lose their broker also cannot proceed.

The data plane is separate from the metadata plane. Partition leaders assigned before the freeze continue to accept produce and fetch requests. Page only if the outage lasts longer than 120 seconds, all controllers have been up for more than 600 seconds, and there is visible data-plane impact such as growing offline partitions or a leader-election storm. Without those conditions, monitor and wait rather than restart controllers.

flowchart TD
    A[KRaft quorum loses leader] --> B[current-leader = -1]
    B --> C[Metadata log frozen]
    C --> D[No topic config or ISR changes]
    C --> E[No new leader elections]
    B --> F[Existing partition leaders]
    F --> G[Data plane continues]
    G --> H{Offline partitions growing?}
    H -->|Yes + uptime > 600s| I[PAGE warranted]
    H -->|No| J[TICKET and monitor]

Common causes

| Cause | What it looks like | First thing to check |
|---|---|---|
| Network partition between voters | current-leader = -1 on all controllers; nodes cannot reach each other on the controller listener | Connectivity and firewall rules between controller endpoints |
| Voter lag or disk pressure | commit-latency-avg elevated; one voter shows high lag in describe --replication output | Disk I/O latency and GC on the lagging node |
| Controller crash or long GC pause | One controller drops out of the voter set; broker logs show JVM errors | Process liveness and GC pause duration |
| Transient election timeout after restart | current-leader flaps to -1 briefly during rolling restart; resolves within seconds | Controller uptimes and election-latency-avg |

Quick checks

# Check quorum leader and state on the local controller
echo "get -b kafka.server:type=raft-metrics current-leader" | java -jar jmxterm.jar -l localhost:9999
echo "get -b kafka.server:type=raft-metrics current-state" | java -jar jmxterm.jar -l localhost:9999

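# Describe quorum status (LeaderId, CurrentVoters, MaxFollowerLag)
kafka-metadata-quorum.sh --bootstrap-server localhost:9092 describe --status

# Inspect per-voter lag and log end offsets
kafka-metadata-quorum.sh --bootstrap-server localhost:9092 describe --replication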

# Verify controller process is running
pgrep -f kafka.Kafka

# Check commit and election latency
echo "get -b kafka.server:type=raft-metrics commit-latency-avg" | java -jar jmxterm.jar -l localhost:9999
echo "get -b kafka.server:type=raft-metrics election-latency-avg" | java -jar jmxterm.jar -l localhost:9999

# Check disk latency on the controller node
iostat -xz 1

How to diagnose it

  1. Confirm the scope. Check current-leader on every controller. If only one node reports -1, that node is a partitioned follower and the quorum may still have a leader elsewhere. If all controllers report -1, the quorum truly has no leader.
  2. Inspect quorum status. Run kafka-metadata-quorum.sh ... describe --status and look for LeaderId. If it is -1 or absent, note the CurrentVoters list, then run describe --replication to record each voter's LogEndOffset. A quorum can elect a leader only while a majority of voters can reach each other.
  3. Identify voter lag. In the describe --replication output, inspect Lag for each voter; describe --status reports only the maximum follower lag. A voter that is far behind the leader can prevent quorum progress if the leader cannot replicate to a majority.
  4. Correlate with latency metrics. In JMX, check commit-latency-avg and election-latency-avg. If commit latency is elevated but no election is in progress, a slow voter or disk is the likely cause. If election latency is high, network partitions or voter disagreement is delaying consensus.
  5. Check data-plane impact. On any broker, check OfflinePartitionsCount. If it is zero and UnderReplicatedPartitions is stable, the data plane is healthy. If offline partitions are growing, the frozen quorum is preventing leader elections for failed brokers. Monitor LeaderElectionRateAndTimeMs to confirm elections have stalled.
  6. Gate on uptime. If any controller has uptime under 600 seconds, treat the incident as a startup transient. Do not page until nodes have warmed up.
  7. Review controller logs. Search controller logs for election timeouts, quorum epoch changes, or disk errors under org.apache.kafka.raft. A crashed node may log a JVM OutOfMemoryError or Fatal error just before resigning. A network partition may manifest as Connection refused or heartbeat timeouts to peer voters.
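
Step 1 can be scripted once the per-controller values are in hand. The following is a minimal sketch, assuming you have already collected current-leader from each controller (for example with the jmxterm command in Quick checks). The function name classify_scope and its output labels are hypothetical, not part of Kafka.

```shell
#!/usr/bin/env bash
# classify_scope (hypothetical helper): pass one current-leader value per controller.
classify_scope() {
  local total=0 leaderless=0 v
  for v in "$@"; do
    total=$((total + 1))
    if [ "$v" = "-1" ]; then
      leaderless=$((leaderless + 1))
    fi
  done

  if [ "$total" -eq 0 ]; then
    echo "no-data"                 # nothing was collected
  elif [ "$leaderless" -eq "$total" ]; then
    echo "quorum-leaderless"       # every controller reports -1
  elif [ "$leaderless" -gt 0 ]; then
    echo "isolated-controllers"    # some nodes cannot see a leader that others can
  else
    echo "leader-present"
  fi
}

# Example: three controllers, the second one has lost sight of the leader
classify_scope 2 -1 2
```

Only the quorum-leaderless result matches the frozen-metadata scenario described here; isolated-controllers points at connectivity on the affected nodes instead.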

Metrics and signals to monitor

| Signal | Why it matters | Warning sign |
|---|---|---|
| current-leader (raft-metrics) | Direct indicator of quorum leadership | Value -1 for more than 120 seconds |
| commit-latency-avg (raft-metrics) | Average time to commit an entry to the metadata log | Sustained above 100 ms |
| election-latency-avg (raft-metrics) | Time required to complete a leader election | Growing or consistently above 1 second |
| Voter lag (describe --replication) | Distance between follower voters and the leader | Lag increasing without recovery |
| ActiveControllerCount | Whether the cluster recognizes one controller | Cluster-wide sum not equal to 1 |
| OfflinePartitionsCount | Confirms data-plane impact from quorum freeze | Nonzero and growing while every controller is past the 600-second uptime gate |
| ControllerEventQueueSize | Backlog of metadata work waiting for the controller | Above 100 and growing continuously |
| last-applied-record-lag-ms (standby controllers/brokers) | Metadata propagation delay | Growing on nodes that should be catching up |

Fixes

Transient partition or startup delay

If current-leader dropped to -1 during or immediately after a controller restart, and controller uptimes are under 600 seconds, do not restart anything. Allow the Raft election timeout to complete. KRaft uses randomized backoff to avoid split votes. Interrupting this with manual restarts usually makes recovery slower and can trigger further election delays.
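
One way to enforce "wait, don't restart" is to poll for a leader with a deadline. This is a sketch only: wait_for_leader is a hypothetical helper, and the probe you pass to it (any command or function that prints the current-leader value) is something you would have to supply, for instance by wrapping the jmxterm call from Quick checks.

```shell
#!/usr/bin/env bash
# wait_for_leader PROBE TIMEOUT_SECS [INTERVAL_SECS]
# PROBE is the name of a command or function that prints current-leader.
# Returns 0 as soon as a leader is seen, 1 if the deadline passes first.
wait_for_leader() {
  local probe="$1" timeout_secs="$2" interval_secs="${3:-5}"
  local waited=0 leader
  while :; do
    leader="$("$probe")"
    if [ -n "$leader" ] && [ "$leader" != "-1" ]; then
      echo "leader=$leader after ${waited}s"
      return 0
    fi
    if [ "$waited" -ge "$timeout_secs" ]; then
      echo "no leader after ${waited}s"
      return 1
    fi
    sleep "$interval_secs"
    waited=$((waited + interval_secs))
  done
}

# Usage (read_leader is a hypothetical probe you define yourself):
#   wait_for_leader read_leader 120 5 || echo "still leaderless, continue diagnosis"
```

The 120-second deadline in the usage line mirrors the alert threshold used in this playbook; the interval should be a positive number of seconds.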

Identify and isolate the failed voter

Use kafka-metadata-quorum.sh describe --replication to find the voter with nonzero lag. Check that node for process crashes, disk I/O stalls, or network drops. If the node is alive but slow, give it time to catch up. Restarting healthy voters reduces the number of available voters and increases the risk of losing majority, which would make the outage worse.
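
To pick the lagging voter out of the tool output, a small filter helps. The sketch below assumes the describe --replication subcommand prints a header row containing NodeId, Lag, and Status columns separated by whitespace; the exact column set varies between Kafka releases, so the filter locates columns by header name. The function name lagging_voters, the threshold, and the sample numbers are illustrative.

```shell
#!/usr/bin/env bash
# lagging_voters [THRESHOLD]: read replication output on stdin and print
# "<NodeId> <Lag>" for each non-observer row whose Lag exceeds THRESHOLD.
lagging_voters() {
  local threshold="${1:-0}"
  awk -v t="$threshold" '
    $1 == "NodeId" {                       # header row: find columns by name
      for (i = 1; i <= NF; i++) {
        if ($i == "Lag") lag = i
        if ($i == "Status") st = i
      }
      next
    }
    lag && st && $st != "Observer" && ($lag + 0) > (t + 0) { print $1, $lag }
  '
}

# Sample input with invented values; in practice pipe the tool output in.
lagging_voters 100 <<'EOF'
NodeId  LogEndOffset  Lag   LastFetchTimestamp  LastCaughtUpTimestamp  Status
3000    15800         0     1700000000000       1700000000000          Leader
3001    15800         0     1700000000000       1700000000000          Follower
3002    9200          6600  1700000000000       1699999000000          Follower
EOF
```

With the sample above, only node 3002 is reported, which is the node to inspect for disk or network trouble.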

Recover a lost controller node

If a controller is down, recover it before another failure costs the quorum its majority. Start the failed node with its original node.id and log.dirs intact. If the node cannot be recovered, remove it from the quorum voter set so the remaining nodes can form a stable majority. Do not leave dead voters in the configuration.

When to page

Page when current-leader = -1 persists for more than 120 seconds, all controllers have uptime above 600 seconds, and you see growing OfflinePartitionsCount or metadata operations failing. If existing partition leaders are stable and no partitions are going offline, keep the incident at ticket severity and focus on voter recovery.
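
The severity rule can be written down as a small decision function. This is a simplified sketch: page_decision is a hypothetical helper, it uses growth in OfflinePartitionsCount between two samples as the only data-plane signal, and the caller is assumed to supply the numbers from its own monitoring.

```shell
#!/usr/bin/env bash
# page_decision LEADERLESS_SECS MIN_CONTROLLER_UPTIME_SECS OFFLINE_PREV OFFLINE_NOW
# Prints PAGE, TICKET, or MONITOR using the 120 s / 600 s thresholds from this playbook.
page_decision() {
  local leaderless_secs="$1" min_uptime_secs="$2"
  local offline_prev="$3" offline_now="$4"

  if [ "$leaderless_secs" -gt 120 ] && [ "$min_uptime_secs" -gt 600 ]; then
    if [ "$offline_now" -gt "$offline_prev" ]; then
      echo "PAGE"      # sustained leader loss with growing offline partitions
    else
      echo "TICKET"    # sustained leader loss, data plane still stable
    fi
  else
    echo "MONITOR"     # too early, or controllers are still warming up
  fi
}

# Example: leaderless for 180 s, youngest controller up 1 h, offline partitions 0 -> 4
page_decision 180 3600 0 4
```

Passing the minimum uptime across all controllers, rather than an average, keeps a single freshly restarted node from triggering a page.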

Prevention

  • Monitor commit latency and voter lag before they trigger leader loss. Alert on commit-latency-avg trending above your baseline and on voter lag growth in the quorum status output.
  • Run controllers on dedicated nodes with independent log.dirs disks. Combined broker-controller mode couples data-plane and control-plane failures and is not recommended for production.
  • Protect controller network paths. The controller listener should be on a stable, low-latency network. Firewall changes affecting the controller port are a common cause of quorum partitions.
  • Maintain an odd voter count. Raft requires a majority. Three voters tolerate one failure; five voters tolerate two. An even count adds no fault tolerance (four voters still tolerate only one failure) and adds another node that can fail.
  • Test quorum failure in game days. Gracefully stop one controller and measure election and recovery time. Stop two controllers and practice the recovery procedure.
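
The voter-count arithmetic behind the odd-count advice is easy to check. A minimal sketch follows; quorum_math is a hypothetical helper name, and the formula is the standard majority rule (more than half of the voters).

```shell
#!/usr/bin/env bash
# quorum_math VOTERS: print the majority size and how many voter failures
# the quorum can absorb while still holding a majority.
quorum_math() {
  local voters="$1"
  local majority=$(( voters / 2 + 1 ))
  local tolerated=$(( voters - majority ))
  echo "voters=$voters majority=$majority tolerated_failures=$tolerated"
}

for n in 3 4 5; do
  quorum_math "$n"
done
# voters=3 majority=2 tolerated_failures=1
# voters=4 majority=3 tolerated_failures=1
# voters=5 majority=3 tolerated_failures=2
```

Going from three to four voters raises the majority requirement without raising the number of tolerated failures, which is why the step from three goes straight to five.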

How Netdata helps

  • Correlate quorum state across controllers. Netdata collects kafka.server:type=raft-metrics including current-leader, commit-latency-avg, and election-latency-avg from every controller node, showing whether the outage is cluster-wide or isolated to one follower.
  • Gate alerts on data-plane impact. Composite alerts can require current-leader = -1 for more than 120 seconds and OfflinePartitionsCount > 0 and minimum controller uptime above 600 seconds, matching the playbook severity logic.
  • Spot quorum pressure early. Rising commit-latency-avg and increasing voter lag appear alongside broker produce latency, distinguishing a metadata-plane stall from a data-plane bottleneck.
  • Track metadata propagation health. Netdata monitors last-applied-record-lag-ms on standby controllers and brokers, showing how far metadata state drifts while the quorum is frozen.