Cassandra node showing DN in nodetool status: gossip, phi, and recovery

You run nodetool status and one of your nodes shows DN (Down/Normal). Before you restart anything, understand that this output is the local node’s opinion, not global truth. In Cassandra’s peer-to-peer architecture, every node runs its own phi accrual failure detector over gossip heartbeats. A DN mark means this specific observer has not heard from the target for roughly 18 seconds at default settings. Another node in the same cluster may still show the same target as UN.

This asymmetry is expected during network partitions, but it is also the signature of a long GC pause. When an observer's own JVM freezes for longer than the phi conviction window, that observer marks its peers as down once it resumes, even though those peers are healthy. Conversely, if the target node itself is paused, all observers mark it down. Distinguishing between a true network failure, a paused target, and a paused observer is the core of the diagnosis.

Operators often assume the DN node is the faulty one, but the truth depends on which side of the conversation failed. If the observer paused, the target is innocent. If the target paused, every observer is correct. Network partitions split the difference: each side thinks the other is down. The phi accrual detector does not distinguish between these scenarios; it only measures silence.

Flapping is worse than a stable down state. A node that oscillates between UP and DOWN triggers hint storms, repeated coordinator rerouting, and unnecessary repair churn. If you see rapid transitions, suspect GC pressure or an overloaded gossip stage rather than a hardware failure.

What this means

Cassandra uses a phi accrual failure detector on top of the gossip protocol. Every second, each node exchanges heartbeats with one to three peers and tracks inter-arrival times in a sliding window. Cassandra's implementation approximates the arrival distribution as exponential, so the detector computes phi from the mean of these times alone:

phi = PHI_FACTOR × time_since_last_gossip / mean_heartbeat_interval

where PHI_FACTOR is 1 / ln(10), approximately 0.434. With the default phi_convict_threshold of 8 and a mean heartbeat interval of 1 second, phi reaches the threshold after 8 / 0.434 ≈ 18.4 seconds of silence, so a node must miss about 18 seconds of heartbeats before the local observer marks it DOWN. The threshold is tunable in cassandra.yaml within the range 5 to 12. Higher values tolerate more latency variance; lower values detect failures faster but increase false positives.
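To make the arithmetic concrete, here is a minimal Python sketch of the simplified formula above. The function names are illustrative and are not Cassandra APIs, and the sketch assumes a steady mean heartbeat interval.

```python
import math

# 1 / ln(10), roughly 0.434, as in the formula above
PHI_FACTOR = 1.0 / math.log(10.0)

def phi(seconds_since_last_heartbeat, mean_interval_s=1.0):
    """Simplified phi: scaled silence divided by the mean heartbeat interval."""
    return PHI_FACTOR * seconds_since_last_heartbeat / mean_interval_s

def conviction_window_s(threshold=8.0, mean_interval_s=1.0):
    """Seconds of silence needed for phi to reach the conviction threshold."""
    return threshold * mean_interval_s / PHI_FACTOR

if __name__ == "__main__":
    # Show how the window grows with phi_convict_threshold
    for threshold in (5, 8, 12):
        print(threshold, round(conviction_window_s(threshold), 1))
```

With a 1-second mean interval this gives windows of about 11.5, 18.4, and 27.6 seconds for thresholds 5, 8, and 12. If the observed mean interval is longer than 1 second, the window stretches proportionally.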

When a node is marked DN, the cluster stops sending live writes to it for the affected token ranges. Coordinators begin storing hinted handoffs locally for up to max_hint_window_in_ms (default 3 hours; named max_hint_window in Cassandra 4.1 and later). If the node remains down beyond that window, coordinators stop writing new hints, and the replica stays inconsistent for every write it missed after that point until you run nodetool repair.

Because each node evaluates phi independently, two nodes can legitimately disagree about a third during a partition. The node that sees DN may be the one with the problem, not the target.

flowchart TD
    A[nodetool status shows DN] --> B{Same view from all nodes?}
    B -->|Yes| C[Target is dead, GC-frozen, or fully isolated]
    B -->|No| D[Partial partition or observer-side issue]
    C --> E{Reachable on port 7000?}
    E -->|Yes| F[Check target GC and heap]
    E -->|No| G[Fix network route or firewall]
    D --> H{Observer GC pause > 18s?}
    H -->|Yes| I[Restart observer and check heap tuning]
    H -->|No| J[Asymmetric partition; test port 7000 from observer]
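The decision tree above can also be written as a small function, which is handy if you want to embed the triage logic in a runbook script. This is a sketch only: the function name, its parameters, and the 18-second default are illustrative assumptions, and the inputs still have to be gathered by hand or by your own tooling.

```python
def classify_dn(all_observers_agree, port_7000_reachable,
                observer_gc_pause_s=0.0, phi_window_s=18.0):
    """Map the three flowchart questions to the suggested next action."""
    if all_observers_agree:
        # Every node sees DN: the target is dead, GC-frozen, or fully isolated
        if port_7000_reachable:
            return "check target GC and heap"
        return "fix network route or firewall"
    # Views differ: partial partition or an observer-side problem
    if observer_gc_pause_s > phi_window_s:
        return "restart observer and check heap tuning"
    return "asymmetric partition; test port 7000 from observer"

if __name__ == "__main__":
    print(classify_dn(all_observers_agree=False,
                      port_7000_reachable=True,
                      observer_gc_pause_s=25.0))
```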

Common causes

| Cause | What it looks like | First thing to check |
|---|---|---|
| Network partition or firewall block | DN on some nodes, UN on others; partial visibility | nc -zv <target-ip> 7000 from the observer |
| Long GC pause on observer node | Observer shows peers as DN; peers are healthy | GC logs on the observing node for pauses > 18s |
| Long GC pause on target node | All observers show the target as DN simultaneously | GC logs and heap usage on the target |
| Gossip disabled by operator (CASSANDRA-8554) | Target sees itself as UP; all peers see it as DN | Whether nodetool disablegossip was run |
| Invalid gossip generation (CASSANDRA-10969) | DN after restart with “invalid gossip generation” warnings | System logs for generation mismatch |
| Clock skew | Mutual DN markings and schema disagreement | NTP sync status across the cluster |

Quick checks

# Check the local node's view of the ring
nodetool status

# Compare views from multiple nodes to detect asymmetry
for h in node1 node2 node3; do echo "=== $h ==="; ssh "$h" nodetool status; done

# Inspect local gossip and failure detector state
nodetool gossipinfo

# Test layer-4 reachability for gossip
nc -zv <target-ip> 7000

# View recent GC pauses (adjust path to your GC log)
grep -i "pause" /var/log/cassandra/gc.log* | tail -20

# Check for gossip generation warnings
grep -i "invalid gossip generation" /var/log/cassandra/system.log

# Verify schema agreement (partitioned nodes often diverge)
nodetool describecluster

# Check NTP synchronization
chronyc tracking
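Comparing several nodetool status outputs by eye gets tedious on larger rings. The sketch below parses the text of each observer's output and reports addresses whose state differs between observers. It assumes the usual layout where each node row starts with a two-letter state code (such as UN or DN) followed by the address; the function names are hypothetical, and collecting the output (for example over ssh) is left to you.

```python
import re
from collections import defaultdict

# A node row starts with status (U/D) + state (N/L/J/M), then the address
STATUS_LINE = re.compile(r"^([UD][NLJM])\s+(\S+)")

def parse_status(output):
    """Return {address: state_code} from `nodetool status` text."""
    states = {}
    for line in output.splitlines():
        match = STATUS_LINE.match(line.strip())
        if match:
            states[match.group(2)] = match.group(1)
    return states

def disputed_nodes(views):
    """views maps observer name -> its `nodetool status` text.

    Returns {address: {observer: state}} for every address on which
    the observers do not all report the same state.
    """
    seen = defaultdict(dict)
    for observer, output in views.items():
        for address, state in parse_status(output).items():
            seen[address][observer] = state
    return {address: by_observer
            for address, by_observer in seen.items()
            if len(set(by_observer.values())) > 1}
```

An empty result means the observers agree; any entry is a candidate for the partial-partition or observer-side checks.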

How to diagnose it

  1. Establish whether the view is symmetric. Run nodetool status from at least three nodes. If some peers still show the target as UN, you have a partial partition or an observer-side issue.
  2. Identify the observer. Pick one node that reports the target as DN. Check its GC logs for stop-the-world pauses exceeding 18 seconds. If the observer itself froze, it falsely convicted the target. This is a common misdiagnosis where operators blame the network when the root cause is heap pressure. If TRACE logging is enabled, look for “has already a pending echo, skipping it” in gossip logs on the observer. This pattern appears when the JVM is too paused to process incoming gossip acknowledgements.
  3. Check the target. If every node shows the target as DN, inspect the target’s logs, CPU, and GC behavior. A target that is alive but frozen in a long GC pause will not respond to gossip or accept new connections even though the process is running.
  4. Test connectivity. From the observing node, run nc -zv <target-ip> 7000. If this fails, the gossip port is blocked by a firewall, security group, or routing issue. This is the single most common cause of persistent asymmetric DN visibility.
  5. Inspect gossip state. Run nodetool gossipinfo on the observer. Look at the STATUS field and the generation number. Stale generation stamps can persist across restarts (CASSANDRA-10969) and require multiple rolling restarts to purge. Upgrade to 2.1.13+ or the equivalent DSE patch for a permanent fix.
  6. Check for disabled gossip. If the target was drained or had gossip disabled via nodetool disablegossip, it will appear DN to peers while remaining UP locally. Re-enable with nodetool enablegossip.
  7. Verify clock synchronization. Run chronyc tracking (or ntpdate -q <ntp-server>) on each node and compare the reported offsets. Drift greater than two seconds prevents gossip convergence and can cause generation mismatches.
  8. Evaluate hint expiration. If the node was down for more than three hours, hints have stopped accumulating and data for that window is missing on the replica. Plan a full nodetool repair after the node recovers.
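For step 2, scanning logs for long pauses can be scripted. The sketch below assumes the observer's system.log contains GCInspector lines that include the text "GC in <N>ms"; that format is an assumption about your Cassandra version and logging setup, so adjust the pattern if your logs differ. GCInspector reports collection duration, which for concurrent collectors is not always pure stop-the-world time, so treat hits as leads rather than proof.

```python
import re

# Assumed GCInspector wording: "... G1 Old Generation GC in 21034ms. ..."
GC_PAUSE = re.compile(r"GC in (\d+)ms")

def long_pauses(log_lines, threshold_ms=2000):
    """Return (duration_ms, line) for GC entries at or above threshold_ms."""
    hits = []
    for line in log_lines:
        match = GC_PAUSE.search(line)
        if match and int(match.group(1)) >= threshold_ms:
            hits.append((int(match.group(1)), line.rstrip()))
    return hits

def exceeds_phi_window(log_lines, phi_window_ms=18000):
    """True if any logged collection ran longer than the phi window."""
    return any(ms > phi_window_ms
               for ms, _ in long_pauses(log_lines, threshold_ms=0))
```

Typical use is to pass an open file object, for example `long_pauses(open("/var/log/cassandra/system.log"))`, adjusting the path to your installation.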

Metrics and signals to monitor

| Signal | Why it matters | Warning sign |
|---|---|---|
| DownEndpointCount (JMX FailureDetector) | Direct count of nodes considered down by this observer | > 0 sustained longer than 5 minutes |
| GC pause duration | Pauses longer than the phi window (~18 s at default) trigger false DN convictions | Max pause > 2 s; critical if > 18 s |
| Heap usage after old GC | Heap pressure drives the GC pauses that cause gossip failures | > 75% of max heap after full GC |
| GOSSIP thread pool pending/blocked | A backed-up gossip stage prevents heartbeat processing and causes flapping | Pending or blocked tasks sustained |
| phi_convict_threshold | Controls sensitivity of the failure detector | Default 8; raise to 10-12 on EC2 or noisy networks |
| Hints directory size | Hints indicate recent DN events and consume disk | Growing when all nodes should be UP |
| Schema versions (describecluster) | Partitioned or paused nodes often fail to propagate schema changes | More than one schema UUID sustained |
| Client request unavailables | Hard failures when consistency level cannot be met | Escalate if DN nodes cause quorum loss |
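The "sustained longer than 5 minutes" condition for DownEndpointCount can be expressed as a simple check over time-ordered samples. This is a sketch with a hypothetical function name; how you collect the samples (JMX exporter, monitoring agent) is outside its scope.

```python
def sustained_down(samples, min_duration_s=300):
    """samples: iterable of (timestamp_s, down_endpoint_count), in time order.

    True if the count stayed above zero continuously for longer than
    min_duration_s. Any sample with a zero count resets the streak.
    """
    streak_start = None
    for timestamp_s, count in samples:
        if count > 0:
            if streak_start is None:
                streak_start = timestamp_s
            if timestamp_s - streak_start > min_duration_s:
                return True
        else:
            streak_start = None
    return False
```

Resetting on every zero sample is what filters out short blips such as a rolling restart.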

Fixes

Network partition or firewall

Restore bidirectional TCP connectivity on port 7000 (or 7001 when TLS is enabled). Gossip requires both sides to see each other. Do not restart Cassandra nodes until the network path is confirmed; restarting into a partition can deepen schema disagreement and prolong convergence.
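If nc is not installed on a node, the same layer-4 check can be done from Python's standard library. The sketch below is roughly equivalent to nc -zv; the function name is hypothetical. It only proves that a TCP connection opens in one direction, so run it from each side toward the other to confirm the path is bidirectional.

```python
import socket

def port_open(host, port=7000, timeout_s=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable routes
        return False
```

Use port 7001 instead when internode TLS is enabled.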

GC pause on observer causing false conviction

If the observing node falsely marked peers as DN because its own JVM paused, the fix is heap tuning, not network repair. As an emergency measure, restart the pausing node to clear the heap and reset its failure detector state. Adjacent nodes may also need a restart if they cached the false DOWN state through gossip propagation. For long-term remediation, see Cassandra GC death spiral and Cassandra heap pressure.

Target node genuinely down or frozen

If the target is hung or crashed, restart it. Once it returns to UN, check how long it was down. If the outage exceeded max_hint_window_in_ms (default 3 hours), run nodetool repair on the recovered node to reconcile the data missed after hints expired. Schedule the repair during a low-traffic window; it is I/O intensive.
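The repair decision is a comparison between the outage length and the hint window. A minimal sketch, assuming the default 3-hour window and using hypothetical function names:

```python
DEFAULT_HINT_WINDOW_MS = 3 * 60 * 60 * 1000  # 3 hours, the default

def unhinted_seconds(downtime_s, max_hint_window_ms=DEFAULT_HINT_WINDOW_MS):
    """Seconds of the outage that were not covered by hints."""
    return max(0, downtime_s - max_hint_window_ms / 1000)

def needs_repair(downtime_s, max_hint_window_ms=DEFAULT_HINT_WINDOW_MS):
    """True if the outage outlasted the hint window."""
    return unhinted_seconds(downtime_s, max_hint_window_ms) > 0
```

For a 4-hour outage with the default window, `unhinted_seconds(4 * 3600)` is 3600, meaning one hour of writes reached the replica neither live nor as hints. Pass your own value if you changed the window in cassandra.yaml.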

Invalid gossip generation (CASSANDRA-10969)

If logs show “invalid gossip generation” after a restart, stale generation stamps are circulating in gossip state. Perform rolling restarts across the cluster to purge the stale state, then upgrade to a patched version (2.1.13+, DSE 4.7.8+, or 4.8.5+).

Gossip disabled by operator

Run nodetool enablegossip on the target node. Verify from a peer that the node transitions back to UN.

Prevention

  • Raise phi_convict_threshold on cloud instances. In cassandra.yaml, increase the threshold from 8 to 10-12 for AWS EC2 or other high-latency-variance environments. This dampens flapping from transient jitter. Lower it to 5-6 only on ultra-stable LANs. Higher values delay detection of real failures, so tune conservatively.
  • Alert on GC pauses before they breach the phi window. A pause that exceeds 18 seconds is already an incident. Alert on max pause > 2 seconds to get ahead of the spiral.
  • Keep clocks synchronized. Run NTP on every node and alert if drift exceeds 100 ms.
  • Separate commitlog and data disks. I/O saturation on a shared disk can stall gossip threads indirectly.
  • Automate repair tracking. If a node flaps or stays down, ensure you alert when the last successful repair approaches gc_grace_seconds. After any outage longer than the hint window, repair is mandatory before the node is considered fully recovered.
  • Avoid nodetool disablegossip in automation. Unless it is part of a documented maintenance window, disabling gossip creates exactly the DN confusion this article describes.

How Netdata helps

  • Correlate DownEndpointCount with GC pause duration on the observing node. A DN event without a network failure, but with a coinciding GC pause > 18 s, points to a false conviction.
  • Track org.apache.cassandra.net:type=FailureDetector JMX attributes alongside JVM heap usage to surface the GC-gossip relationship.
  • Alert on sustained DownEndpointCount > 0 for longer than 5 minutes, filtering out brief blips from rolling restarts.
  • Monitor java.lang:type=GarbageCollector pause times per node to distinguish observer-side GC from target-side failure.
  • Surface nodetool status asymmetry by collecting the metric from multiple nodes and comparing views.

Related guides

  • Cassandra compaction strategies: STCS vs LCS vs TWCS vs UCS: /guides/cassandra/cassandra-choosing-compaction-strategy/
  • Cassandra compaction death spiral: when writes outrun compaction throughput: /guides/cassandra/cassandra-compaction-death-spiral/
  • Cassandra consistency levels explained: QUORUM, ONE, LOCAL_QUORUM, and EACH_QUORUM: /guides/cassandra/cassandra-consistency-levels-explained/
  • Cassandra zombie data resurrection: gc_grace_seconds and unrepaired tombstones: /guides/cassandra/cassandra-data-resurrection-gc-grace/
  • Cassandra disk space exhaustion: emergency recovery when the data volume fills: /guides/cassandra/cassandra-disk-space-exhaustion/
  • Cassandra dropped mutations: silent write loss and load shedding: /guides/cassandra/cassandra-dropped-mutations/
  • Cassandra GC death spiral: long pauses, gossip flapping, and recovery: /guides/cassandra/cassandra-gc-death-spiral/
  • Cassandra GC pauses too long: diagnosing G1 stop-the-world pauses: /guides/cassandra/cassandra-gc-pauses-too-long/
  • Cassandra heap pressure: sizing the JVM heap and tuning G1GC: /guides/cassandra/cassandra-heap-pressure-tuning/
  • Cassandra monitoring checklist: the signals every production cluster needs: /guides/cassandra/cassandra-monitoring-checklist/
  • Cassandra monitoring maturity model: from survival to expert: /guides/cassandra/cassandra-monitoring-maturity-model/
  • Cassandra Not enough space for compaction: STCS space amplification and recovery: /guides/cassandra/cassandra-not-enough-space-for-compaction/