Cassandra schema disagreement: nodetool describecluster shows multiple versions
nodetool describecluster should report exactly one schema version UUID cluster-wide. When the Schema versions section lists more than one UUID, the cluster is in schema disagreement. DDL (CREATE, ALTER, DROP) fails or hangs until every node converges. DML and reads against existing tables are unaffected.
Transient disagreement lasting less than five minutes during or immediately after a DDL change is normal. Cassandra propagates schema mutations asynchronously via gossip. If multiple versions persist beyond five minutes, you have a stuck node, a partitioned peer, or a migration stage backlog.
What this means
Cassandra stores schema metadata in local system tables and versions the entire schema as a single UUID. When a DDL statement executes, the coordinator proposes a new schema mutation, updates its local version, and gossips the change to the rest of the ring. Each node applies the mutation in its Migration stage and advertises the new version.
If a node is DOWN, unreachable, or its migration thread is stalled, it never applies the mutation. That node retains the old schema UUID while the rest of the cluster moves forward. Cassandra serializes schema changes globally, so it refuses the next DDL until the outlier catches up.
The schema version is a digest of the entire schema definition. Any structural difference produces a different UUID. One UUID means agreement; more than one means divergence.
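The "one UUID means agreement" rule can be checked mechanically. The following is a minimal sketch, not an official tool: count_schema_versions is a hypothetical helper name, and it assumes each schema version appears in the describecluster output as a line of the form `<uuid>: [ip, ip, ...]`.

```shell
# Count the schema version UUIDs in `nodetool describecluster` output read from stdin.
# Assumption: each version is printed on its own line as "<uuid>: [ip, ip, ...]".
# The UNREACHABLE line does not match the UUID pattern, so it is not counted.
count_schema_versions() {
  grep -cE '^[[:space:]]*[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}: \[' || true
}
```

Piping `nodetool describecluster` into this function should print 1 on a cluster that is in agreement and a larger number during disagreement, provided the output matches the assumed format.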
flowchart TD
A[nodetool describecluster shows >1 schema version] --> B{Transient? <5 min after DDL}
B -->|Yes| C[Wait for gossip propagation]
B -->|No| D{Outlier node status}
D -->|DOWN| E[Restore node liveness first]
D -->|UP| F[Check nodetool tpstats Migration pending]
F -->|Pending > 0| G[Wait or drain node]
F -->|Pending = 0| H[Run nodetool resetlocalschema on outlier]
E --> I[Re-check describecluster]
G --> I
H --> I
I -->|Still disagreeing| J[Graceful rolling restart of outlier]
I -->|One version| K[Resolved]
J --> K
Common causes
| Cause | Looks like | First check |
|---|---|---|
| Transient gossip propagation | Multiple versions for seconds to minutes after DDL | nodetool describecluster after 60 seconds |
| Node DOWN or unreachable | One UUID lists a node that is DN in nodetool status | nodetool status |
| Migration stage backlog | Schema changes hang; nodetool tpstats shows pending Migration tasks | nodetool tpstats |
| Rolling upgrade in progress | Nodes on different Cassandra versions show different UUIDs | RELEASE_VERSION in nodetool gossipinfo, or nodetool version on each node |
| Partitioned or zombie node | Node is UP but retains an old version; may appear in UNREACHABLE | Last line of nodetool describecluster |
Quick checks
# Schema version distribution. A healthy cluster shows one UUID that lists all nodes.
nodetool describecluster
# Node liveness. A DN node cannot apply schema changes.
nodetool status
# Pending schema mutations. In 3.x look for MigrationStage; in 4.x, MIGRATION.
nodetool tpstats | grep -i migration
# Gossip reachability. If the outlier is dead, schema cannot propagate.
nodetool gossipinfo
# Active streaming or topology changes that may delay schema application.
nodetool netstats
Pay attention to the last line of nodetool describecluster. It lists UNREACHABLE nodes separately from the schema version groupings. An UNREACHABLE node is a gossip issue that must be resolved before schema reconciliation can succeed.
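To pull the unreachable endpoints out of that last line for scripting, something like the following could work. It is a sketch under one assumption: the line has the form `UNREACHABLE: [ip, ip, ...]`. The function name list_unreachable is hypothetical.

```shell
# Print each endpoint from the UNREACHABLE line of describecluster output (stdin),
# one per line. Prints nothing if there is no UNREACHABLE line.
# Assumption: the line looks like "UNREACHABLE: [10.0.0.4, 10.0.0.5]".
list_unreachable() {
  sed -n 's/^[[:space:]]*UNREACHABLE: \[\(.*\)\][[:space:]]*$/\1/p' \
    | tr ',' '\n' \
    | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' \
    | grep -v '^$' || true
}
```

Any address this prints is a gossip problem to fix before attempting schema reconciliation.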
How to diagnose it
1. Confirm the symptom is sustained. Run nodetool describecluster. If multiple versions appear, wait five minutes and run it again. If the cluster has just executed DDL, give gossip time to propagate. If it resolves within the window, stop.
2. Identify the outlier. The output groups nodes by schema UUID. Note which IP addresses are attached to the older version.
3. Check node liveness. Run nodetool status. If the outlier is DN (Down Normal), the root cause is node failure or network partition, not a schema bug. Recover the node first. Once it rejoins the ring and gossip stabilizes, it should pull the latest schema automatically.
4. Check for UNREACHABLE nodes. In nodetool describecluster, the last line may list UNREACHABLE endpoints. These nodes are not responding to gossip. Restore gossip connectivity before expecting schema convergence.
5. Inspect the Migration stage. Run nodetool tpstats and look for the Migration stage. If Pending is greater than zero and not decreasing, the node has queued schema mutations that are not being processed. Correlate with GC pauses, high CPU, or disk saturation on the outlier.
6. Determine if a rolling upgrade is in progress. If nodes are running different Cassandra versions, schema disagreement is expected because nodes on different versions may not stream schema to each other. Do not run nodetool resetlocalschema in a mixed-version cluster unless the disagreement persists after all nodes are on the same version.
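Identifying the outlier can be partly scripted. The sketch below lists each schema UUID with its node count, smallest group first. It assumes the `<uuid>: [ip, ip, ...]` line format, and schema_groups is a hypothetical name. The smallest group is usually the outlier, but that is a heuristic: the output alone does not say which version is older.

```shell
# For each schema UUID in describecluster output (stdin), print
# "<node_count> <uuid> <comma-separated nodes>", smallest group first.
# Assumption: versions are printed as "<uuid>: [ip, ip, ...]".
schema_groups() {
  awk '
    /^[ \t]*[0-9a-fA-F]+(-[0-9a-fA-F]+)+: \[/ {
      uuid = $1
      sub(/:$/, "", uuid)                      # strip the trailing colon
      open = index($0, "[")
      close_pos = index($0, "]")
      nodes = substr($0, open + 1, close_pos - open - 1)
      count = split(nodes, parts, /, */)       # number of endpoints in the group
      print count, uuid, nodes
    }' | sort -n
}
```

Run it as `nodetool describecluster | schema_groups` and start the liveness checks with the addresses on the first line.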
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
| Schema version count | Direct indicator of cluster-wide agreement | >1 UUID sustained for >5 minutes |
| Migration stage pending tasks | Schema mutations queue here before application | Pending > 0 sustained or growing |
| Node liveness (nodetool status) | DOWN nodes cannot receive or apply mutations | Any node DN or UJ longer than expected |
| Gossip unreachable members | Nodes that cannot be reached for schema sync | UNREACHABLE endpoints in describecluster |
| GC pause duration | Long pauses can stall the migration stage and gossip | Pauses > 2 seconds sustained |
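For the Migration stage row, the pending count can be sampled from nodetool tpstats. This is a sketch, not a monitoring integration: migration_pending is a hypothetical name, and it assumes the thread pool table columns are pool name, Active, Pending, and so on, which puts Pending in the third field.

```shell
# Print the Pending count of the migration stage from `nodetool tpstats` output (stdin).
# Matches the pool name case-insensitively (MigrationStage or MIGRATION).
# Exits non-zero if no migration row is found.
migration_pending() {
  awk '
    tolower($1) ~ /^migration/ { print $3; found = 1; exit }
    END { if (!found) exit 1 }
  '
}
```

Sampling `nodetool tpstats | migration_pending` a few times, some seconds apart, shows whether the queue is draining or stuck.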
Fixes
Outlier node is DOWN or unreachable
Do not attempt schema repair on a node that is not fully in the ring. Bring the node back to UN (Up Normal) first. Once it is reachable via gossip, schema convergence usually happens automatically. If the node returns but still shows the old schema UUID after five minutes, proceed to the next fix.
Migration backlog or hung schema pull on a live node
If the outlier is UP but refuses to converge, run nodetool resetlocalschema on that node. This drops the local node’s schema tables and repopulates them by pulling the current schema from a gossip peer. It is safe to run on a live node, but it forces a full resync and should not be your first reaction to transient disagreement. The node must reach peers via gossip for the pull to succeed.
After running the command, wait up to five minutes and re-check nodetool describecluster. Most cases resolve here.
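The wait-and-re-check step can be scripted as a polling loop. The sketch below is one way to do it, with assumptions: wait_for_schema_agreement is a hypothetical helper, the 300-second default mirrors the five-minute window above, the 15-second interval is arbitrary, and schema versions are assumed to appear as `<uuid>: [ip, ...]` lines.

```shell
# Poll describecluster until exactly one schema version is reported or the timeout expires.
# Usage: wait_for_schema_agreement [timeout_seconds] [interval_seconds]
wait_for_schema_agreement() {
  timeout_s="${1:-300}"
  interval_s="${2:-15}"
  elapsed_s=0
  uuid_re='^[[:space:]]*[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}: \['
  while :; do
    # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
    versions=$(nodetool describecluster 2>/dev/null | grep -cE "$uuid_re") || true
    if [ "$versions" -eq 1 ]; then
      echo "schema agreement reached after ${elapsed_s}s"
      return 0
    fi
    if [ "$elapsed_s" -ge "$timeout_s" ]; then
      echo "still ${versions} schema versions after ${elapsed_s}s" >&2
      return 1
    fi
    sleep "$interval_s"
    elapsed_s=$((elapsed_s + interval_s))
  done
}
```

A typical sequence on the outlier would be `nodetool resetlocalschema && wait_for_schema_agreement`. A non-zero exit means the disagreement outlasted the window and the next fix applies.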
Persistent disagreement after resetlocalschema
If the cluster still shows multiple versions, perform a graceful rolling restart of the outlier. First, drain the node. This disables native transport and stops client traffic on that node.
nodetool drain
Then restart the Cassandra process. drain flushes memtables to disk and stops the node from accepting new writes, so the shutdown is safer than a hard kill and the node has no commit log to replay on startup. After the node rejoins, verify nodetool describecluster shows a single version.
Rolling upgrade scenario
If schema disagreement appears during a rolling upgrade, finish upgrading all nodes before attempting any schema repair. Mixed-version clusters inherently may not agree on schema format. Once every node is on the same version, the disagreement should resolve automatically. If it does not, only then consider nodetool resetlocalschema on the remaining outliers.
Prevention
- Serialize DDL operations. Do not issue a new CREATE, ALTER, or DROP until nodetool describecluster returns exactly one schema version. Parallel or rapid-fire DDL is a common cause of spurious disagreement.
- Check nodetool status before running DDL. If any node is DN, UJ, or in a non-NORMAL state, wait until the cluster is fully stable.
- During rolling restarts or upgrades, pause DDL until the topology is fully stable and all nodes report UN.
- Monitor the Migration stage pending task count via nodetool tpstats or JMX. A growing queue is an early warning that schema mutations are stalling.
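The first two rules can be combined into a pre-flight gate for deployment scripts. The following is a sketch under stated assumptions: safe_for_ddl is a hypothetical name, node rows in nodetool status are assumed to start with a two-letter state code such as UN or DN, and schema versions are assumed to appear as `<uuid>: [ip, ...]` lines.

```shell
# Return 0 only if every node is UN and exactly one schema version is reported.
safe_for_ddl() {
  uuid_re='^[[:space:]]*[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}: \['
  status_out=$(nodetool status) || return 1
  describe_out=$(nodetool describecluster) || return 1
  # Node rows start with Up/Down plus Normal/Leaving/Joining/Moving, e.g. "UN" or "DN".
  total=$(printf '%s\n' "$status_out" | grep -cE '^[UD][NLJM][[:space:]]') || true
  up_normal=$(printf '%s\n' "$status_out" | grep -cE '^UN[[:space:]]') || true
  versions=$(printf '%s\n' "$describe_out" | grep -cE "$uuid_re") || true
  [ "$total" -gt 0 ] && [ "$up_normal" -eq "$total" ] && [ "$versions" -eq 1 ]
}
```

A deployment script might then run `safe_for_ddl && cqlsh -f migration.cql`, where migration.cql is a placeholder file name. The gate reflects the state seen from one node at one moment, so it reduces the risk of issuing DDL into an unstable cluster without eliminating it.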
How Netdata helps
Netdata collects the SchemaVersions map via JMX from org.apache.cassandra.db:type=StorageProxy (the same MBean nodetool describecluster reads) and exposes whether the cluster is in agreement. You can correlate schema disagreement with node liveness and gossip health.
- Alerts on sustained schema disagreement (> 5 minutes) without requiring manual nodetool checks.
- Correlates schema splits with unreachable gossip members to distinguish node failure from migration stalls.
- Tracks Migration stage pending tasks via JMX thread pool metrics to catch backlog before it blocks DDL.
- Surfaces the signal alongside GC pause duration and thread pool saturation, helping you determine whether a migration stall is caused by JVM pressure or I/O blocking.
Related guides
- Cassandra compaction strategies: STCS vs LCS vs TWCS vs UCS
- Cassandra compaction death spiral: when writes outrun compaction throughput
- Cassandra consistency levels explained: QUORUM, ONE, LOCAL_QUORUM, and EACH_QUORUM
- Cassandra zombie data resurrection: gc_grace_seconds and unrepaired tombstones
- Cassandra disk space exhaustion: emergency recovery when the data volume fills
- Cassandra dropped mutations: silent write loss and load shedding
- Cassandra dropped reads and other messages: reading nodetool tpstats Dropped
- Cassandra GC death spiral: long pauses, gossip flapping, and recovery
- Cassandra GC pauses too long: diagnosing G1 stop-the-world pauses
- Cassandra gossip flapping: nodes bouncing UP and DOWN
- Cassandra heap pressure: sizing the JVM heap and tuning G1GC
- Cassandra monitoring checklist: the signals every production cluster needs







