Elasticsearch cluster health yellow: unassigned replicas vs real allocation blocks

GET /_cluster/health returning status: yellow means all primaries are assigned but at least one replica is not. Some clusters are yellow by design. Others are yellow because a disk cascade, stuck allocator, or failed shard is blocking recovery. Benign yellow resolves on its own; structural yellow persists. If a multi-node cluster has been yellow for more than thirty minutes and no recoveries are progressing, something is blocking allocation. Teams that treat yellow as cosmetic often miss the transition from transient recovery to a real incident, leaving them one node failure away from red.

Elasticsearch uses yellow to mean that every primary shard is assigned and searchable, but the replica set is incomplete. A single unassigned replica on any index turns the entire cluster yellow. That replica might be unassigned because the allocator is waiting for a node to return, because a disk watermark has frozen placement, because allocation is disabled, or because the replica failed to allocate repeatedly and exhausted its retry budget. Knowing which one applies requires looking past the color to the allocator’s reasoning.

What this means

Green, yellow, and red are allocation states, not performance scores. Green means every primary and every replica is assigned. Yellow means all primaries are assigned and at least one replica is not. Red means at least one primary is unassigned.

Yellow is permanent by design on single-node clusters with replicas configured. Elasticsearch refuses to place a replica on the same node as its primary. With one data node and number_of_replicas greater than zero, those replicas have nowhere to go.

Yellow is also transient during rolling restarts and node recoveries. When a data node leaves, the allocator waits for index.unassigned.node_left.delayed_timeout (default one minute) before allocating replacement replicas, in case the node returns. During that window the cluster is yellow but operational. Active recoveries often resolve quickly, but large shards or heavy indexing can extend the window.

Sustained yellow beyond the expected recovery window means the allocator has tried and failed to place replicas. Common blockers include disk watermarks (low at 85%, high at 90%, flood stage at 95% by default), allocation filtering or awareness attributes that leave no valid target, manually disabled allocation, and shards that failed allocation and exhausted their maximum retry count.
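The watermark tiers can be expressed as a small lookup. This is a minimal sketch that assumes the default thresholds quoted above and treats a threshold as crossed only when usage is strictly above it; `watermark_tier` is a hypothetical helper name, not part of any Elasticsearch client.

```python
# Default disk watermarks, as percentages of disk used.
# Clusters that override cluster.routing.allocation.disk.watermark.* need other values.
LOW, HIGH, FLOOD = 85.0, 90.0, 95.0


def watermark_tier(disk_used_percent: float) -> str:
    """Return the highest default watermark a node has crossed (hypothetical helper)."""
    if disk_used_percent > FLOOD:
        return "flood_stage"  # indices with shards here get the read-only block
    if disk_used_percent > HIGH:
        return "high"         # Elasticsearch relocates shards away from the node
    if disk_used_percent > LOW:
        return "low"          # the allocator stops placing new replicas here
    return "ok"
```

A node at 87% falls in the `low` tier: it keeps the shards it already holds but is no longer a valid target for unassigned replicas.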

flowchart TD
    A[Cluster health: yellow] --> B{How many data nodes?}
    B -->|One| C[Set replicas to 0 or add nodes]
    B -->|Multiple| D{Yellow sustained >30 min?}
    D -->|No| E[Transient: recovery or node-left delay]
    D -->|Yes| F{Disk above 85%?}
    F -->|Yes| G[Disk watermark blocker]
    F -->|No| H{Allocation explain says?}
    H -->|ALLOCATION_FAILED| I[Retry failed or fix corrupt shard]
    H -->|THROTTLED| J[Wait: heavy recovery load]
    H -->|NO_VALID_SHARD_COPY| K[Restore from snapshot]
    H -->|same node as primary| L[Single-node structural yellow]
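
The same decision tree can be read as a function. The sketch below mirrors the flowchart branch by branch; the function name, its parameters, and the `explain_hint` labels are illustrative assumptions, and the caller is expected to have gathered the inputs from the health, allocation, and allocation explain APIs already.

```python
from typing import Optional

# Labels taken from the flowchart edges; "SAME_NODE" stands for "same node as primary".
_EXPLAIN_ACTIONS = {
    "ALLOCATION_FAILED": "retry failed allocation or fix the corrupt shard",
    "THROTTLED": "wait: heavy recovery load",
    "NO_VALID_SHARD_COPY": "restore from snapshot",
    "SAME_NODE": "single-node structural yellow",
}


def triage_yellow(data_nodes: int, minutes_yellow: float,
                  max_disk_percent: float,
                  explain_hint: Optional[str] = None) -> str:
    """Walk the flowchart top to bottom and return the first matching branch."""
    if data_nodes <= 1:
        return "set replicas to 0 or add nodes"
    if minutes_yellow <= 30:
        return "transient: recovery or node-left delay"
    if max_disk_percent > 85:
        return "disk watermark blocker"
    return _EXPLAIN_ACTIONS.get(explain_hint, "run allocation explain")
```

For example, `triage_yellow(3, 45, 70, "THROTTLED")` falls through the node-count, duration, and disk checks and lands on the throttling branch.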

Common causes

| Cause | What it looks like | First thing to check |
| --- | --- | --- |
| Single-node cluster with replicas | Permanent yellow; replicas have no second data node to land on | GET /_cluster/health?filter_path=number_of_nodes,number_of_data_nodes |
| Rolling restart or node recovery | Yellow during restart, resolving as shards recover | GET /_cat/recovery?v&active_only and node uptime |
| Disk watermark blocker | One or more nodes above 85% disk; allocator stops placing shards there | GET /_cat/allocation?v sorted by disk percent |
| Allocation filtering or awareness misconfiguration | Free nodes exist but replicas stay unassigned | GET /_cluster/allocation/explain |
| Max retries exhausted on a failing shard | Shard state UNASSIGNED with unassigned.reason ALLOCATION_FAILED; will not self-heal | GET /_cat/shards?v&h=index,shard,prirep,state,unassigned.reason |
| Allocation disabled cluster-wide | Allocation disabled or restricted; replica allocation or rebalancing blocked | GET /_cluster/settings?filter_path=*.cluster.routing.allocation.enable |

Quick checks

These commands are read-only and safe to run against a production cluster.

# Check cluster health and node counts
curl -s 'http://localhost:9200/_cluster/health?filter_path=status,number_of_nodes,number_of_data_nodes,unassigned_shards,relocating_shards'

# List nodes with disk usage and heap
curl -s 'http://localhost:9200/_cat/nodes?v&h=name,node.role,heap.percent,cpu,load_1m,disk.used_percent'

# Check disk usage per node relative to shard data
curl -s 'http://localhost:9200/_cat/allocation?v'

# List unassigned shards and their reasons
curl -s 'http://localhost:9200/_cat/shards?v&h=index,shard,prirep,state,unassigned.reason&s=state' | grep UNASSIGNED

# Ask the allocator why the first unassigned shard is stuck
curl -s 'http://localhost:9200/_cluster/allocation/explain?pretty'

# Check whether allocation is disabled globally
curl -s 'http://localhost:9200/_cluster/settings?include_defaults=true&filter_path=*.cluster.routing.allocation.enable'

# Check active recoveries to distinguish transient from stuck
curl -s 'http://localhost:9200/_cat/recovery?v&active_only&h=index,shard,time,type,stage,source_host,target_host,bytes_percent'
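
When many shards are unassigned, grouping them by reason is faster than reading the raw listing. The sketch below assumes the `_cat/shards` call above was made with `format=json` added, so each row is an object keyed by the requested column names; `unassigned_by_reason` is a hypothetical helper, and the sample rows are invented for illustration.

```python
import json
from collections import Counter


def unassigned_by_reason(cat_shards_json: str) -> Counter:
    """Count unassigned shards per (prirep, unassigned.reason) pair."""
    rows = json.loads(cat_shards_json)
    return Counter(
        (row["prirep"], row.get("unassigned.reason") or "UNKNOWN")
        for row in rows
        if row["state"] == "UNASSIGNED"
    )


# Invented sample in the shape of
# _cat/shards?format=json&h=index,shard,prirep,state,unassigned.reason
sample = json.dumps([
    {"index": "logs-1", "shard": "0", "prirep": "p", "state": "STARTED", "unassigned.reason": None},
    {"index": "logs-1", "shard": "0", "prirep": "r", "state": "UNASSIGNED", "unassigned.reason": "NODE_LEFT"},
    {"index": "logs-2", "shard": "0", "prirep": "r", "state": "UNASSIGNED", "unassigned.reason": "NODE_LEFT"},
    {"index": "logs-3", "shard": "1", "prirep": "r", "state": "UNASSIGNED", "unassigned.reason": "ALLOCATION_FAILED"},
])
summary = unassigned_by_reason(sample)
```

Any key whose first element is `p` means an unassigned primary, which is a red-cluster incident rather than a yellow one.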

How to diagnose it

  1. Confirm the cluster is yellow, not red. Run GET /_cluster/health. If status is red, the problem is unassigned primaries, which is a different incident. Stop and handle data availability first.

  2. Check node count and roles. number_of_data_nodes tells you whether you are in the single-node structural yellow case. If number_of_data_nodes is one and replicas are configured, yellow is permanent by design.

  3. Look for active recoveries. GET /_cat/recovery?v&active_only shows whether shards are currently copying. If recoveries are active and progressing, yellow is likely transient. If no recoveries are active and the cluster has been yellow for more than thirty minutes, there is a structural blocker.

  4. Inspect disk watermarks. GET /_cat/allocation?v shows disk.percent per node. If any data node is above 85% (the low watermark), the allocator will not place new replicas on that node. If the node is above 90% (the high watermark), Elasticsearch actively relocates shards away. Above 95% (flood stage), Elasticsearch marks indices with shards on that node as read-only with index.blocks.read_only_allow_delete.

  5. Run allocation explain. GET /_cluster/allocation/explain returns the allocator’s reasoning for the first unassigned shard. The response tells you whether placement is delayed, throttled, blocked by a hard rule, or impossible due to missing or corrupt shard data.

  6. Check for allocation blocks. GET /_cluster/settings?include_defaults=true&filter_path=*.cluster.routing.allocation.enable shows whether allocation is restricted. Values other than all limit which shards can allocate. For example, primaries allows primary allocation but blocks replicas. Also verify whether any indices carry a flood-stage read-only block.

  7. Check for exhausted retries. If GET /_cat/shards shows a shard in state UNASSIGNED with unassigned.reason ALLOCATION_FAILED and the cluster has been stable for some time, the shard has probably exceeded its maximum allocation retries (index.allocation.max_retries, default 5). It will stay unassigned until you manually trigger a retry.
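
Step 5 usually comes down to finding which deciders said no on which nodes. The sketch below pulls that out of an allocation explain response. It assumes the response carries a `node_allocation_decisions` list whose entries have `node_name` and a `deciders` list with `decider` and `decision` fields, which is how the API reports per-node decisions as far as I know; `blocking_deciders` and the sample response are illustrative only.

```python
def blocking_deciders(explain: dict) -> dict:
    """Map each decider that returned NO to the nodes it rejected."""
    blockers: dict = {}
    for node in explain.get("node_allocation_decisions", []):
        for decider in node.get("deciders", []):
            if decider.get("decision") == "NO":
                blockers.setdefault(decider["decider"], []).append(node.get("node_name"))
    return blockers


# Invented, trimmed-down response for a replica that cannot be placed.
sample_explain = {
    "index": "logs-1",
    "shard": 0,
    "primary": False,
    "current_state": "unassigned",
    "can_allocate": "no",
    "node_allocation_decisions": [
        {"node_name": "es-1", "node_decision": "no",
         "deciders": [{"decider": "same_shard", "decision": "NO",
                       "explanation": "a copy of this shard is already on this node"}]},
        {"node_name": "es-2", "node_decision": "no",
         "deciders": [{"decider": "disk_threshold", "decision": "NO",
                       "explanation": "node is above the low watermark"}]},
    ],
}
blockers = blocking_deciders(sample_explain)
```

Here the primary's own node is ruled out by `same_shard` and the only other node by `disk_threshold`, so the fix is freeing disk on `es-2`, not retrying allocation.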

Metrics and signals to monitor

| Signal | Why it matters | Warning sign |
| --- | --- | --- |
| Cluster health status | Coarse three-state indicator of shard allocation | yellow sustained longer than 30 minutes |
| Unassigned shard count | Quantifies the scope of the problem | Any unassigned primary; unassigned replicas persisting beyond recovery windows |
| Disk usage vs. watermarks | Leading cause of blocked replica placement | Any data node above the 85% low watermark |
| Node count | Node loss triggers reallocation and yellow states | Unexpected drop in number_of_data_nodes |
| Shard recovery progress | Distinguishes transient recovery from a stuck shard | Recovery stage unchanged for more than 30 minutes |
| cluster.routing.allocation.enable | Manual setting can block replica or primary allocation | Value set to anything other than all outside of maintenance |
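
The "sustained longer than 30 minutes" condition needs a duration, not a single reading. A minimal sketch, assuming health status is polled periodically and stored as `(unix_seconds, status)` pairs in ascending time order; the function names and the sample series are hypothetical.

```python
def minutes_continuously_yellow(samples) -> float:
    """Length in minutes of the unbroken yellow run ending at the latest sample."""
    if not samples or samples[-1][1] != "yellow":
        return 0.0
    end = start = samples[-1][0]
    # Walk backwards until the first non-yellow sample.
    for timestamp, status in reversed(samples):
        if status != "yellow":
            break
        start = timestamp
    return (end - start) / 60.0


def should_alert(samples, threshold_minutes: float = 30.0) -> bool:
    """True when yellow has lasted longer than the threshold."""
    return minutes_continuously_yellow(samples) > threshold_minutes


# Invented series: green at t=0, then yellow from t=600s through t=3000s.
history = [(0, "green"), (600, "yellow"), (1800, "yellow"), (3000, "yellow")]
```

With this series the yellow run spans 2400 seconds, so `should_alert(history)` fires, while a brief yellow blip during a rolling restart does not.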

Fixes

Single-node structural yellow. If you have one data node and replicas configured, either add data nodes or set index.number_of_replicas: 0. Reducing replicas removes redundancy, but on a single node there is no redundancy anyway.

Warning: The following command modifies all indices immediately.

curl -X PUT 'http://localhost:9200/_all/_settings' -H 'Content-Type: application/json' -d '{"index.number_of_replicas": 0}'

Disk watermark blocker. Free disk space by deleting old indices, force-merging, shrinking indices, or adding nodes. If a node crossed the flood-stage watermark (95%), Elasticsearch marks indices with shards on that node as read-only. After you free disk below the high watermark (90%), Elasticsearch automatically removes the read-only block. You can also clear it manually.

Warning: The following command targets all indices. Run it only after you have freed enough disk space to avoid the block reapplying immediately.

curl -X PUT 'http://localhost:9200/_all/_settings' -H 'Content-Type: application/json' -d '{"index.blocks.read_only_allow_delete": null}'

Allocation disabled. Re-enable allocation if it was left disabled after maintenance.

curl -X PUT 'http://localhost:9200/_cluster/settings' -H 'Content-Type: application/json' -d '{"persistent":{"cluster.routing.allocation.enable":"all"}}'

Max retries exhausted. If a shard is unassigned with reason ALLOCATION_FAILED and has used up its retries, trigger a retry after addressing the underlying cause.

Warning: This retries every failed shard simultaneously and can trigger a burst of recovery traffic. It may fail again if the underlying corruption or disk issue persists.

curl -X POST 'http://localhost:9200/_cluster/reroute?retry_failed=true'

Awareness or filtering misconfiguration. If allocation explain shows that no node matches required attributes, correct the index.routing.allocation settings or add nodes with the required attributes. Forcing a replica onto a node that violates awareness rules risks placing both copies in the same failure domain.

Prevention

Monitor per-node disk usage independently. Cluster-wide disk averages hide hot nodes. A node at 87% disk with replicas waiting to allocate there will keep the cluster yellow even if the cluster average is 60%. Alert when any data node crosses 80%.
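
A short sketch of why the average misleads, using the numbers from the paragraph above. The node names and the helper are hypothetical; the percentages would come from disk.percent in `_cat/allocation`.

```python
def nodes_over(disk_by_node: dict, threshold: float = 80.0) -> list:
    """Return the names of nodes whose disk usage exceeds the alert threshold."""
    return sorted(name for name, pct in disk_by_node.items() if pct > threshold)


# Invented three-node cluster: one hot node, two half-empty ones.
disk = {"es-1": 87.0, "es-2": 47.0, "es-3": 46.0}
average = sum(disk.values()) / len(disk)
hot = nodes_over(disk)
```

The average comes out at 60%, well under any watermark, yet `es-1` is already past the 85% low watermark and is the node the per-node alert would flag.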

Do not silence yellow alerts permanently. Single-node dev clusters aside, sustained yellow should be a ticket-level alert. Normalizing yellow trains operators to ignore the state that precedes disk cascades and replica starvation.

Size replica counts to node counts. If you run with number_of_replicas: 1, you need at least two data nodes. If you run with number_of_replicas: 2, you need at least three. Replicas that have no valid target node add no redundancy and keep the cluster permanently yellow.
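
The sizing rule follows from the constraint that no two copies of a shard share a node, so each shard needs `number_of_replicas + 1` data nodes. A minimal sketch of that arithmetic, with hypothetical function names and ignoring filtering, awareness, and disk constraints:

```python
def min_data_nodes(number_of_replicas: int) -> int:
    """Data nodes needed so every copy of a shard sits on a different node."""
    return number_of_replicas + 1


def unplaceable_replicas(number_of_shards: int, number_of_replicas: int,
                         data_nodes: int) -> int:
    """Replica copies of one index that can never be assigned on this cluster."""
    # The primary occupies one node, leaving data_nodes - 1 for replicas.
    placeable_per_shard = max(0, data_nodes - 1)
    missing_per_shard = max(0, number_of_replicas - placeable_per_shard)
    return number_of_shards * missing_per_shard
```

An index with 5 primary shards and 1 replica on a single data node has 5 replicas that can never be placed, which is exactly the permanent-yellow case.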

Check allocation explain during incidents. After a node departs, check GET /_cluster/allocation/explain once the node-left delay has expired. If the reason is throttled, wait. If the reason is a hard block due to disk or awareness, act immediately.

Review allocation settings after maintenance. Rolling restarts and cluster maintenance sometimes leave cluster.routing.allocation.enable set to none or primaries. Verify it is back to all before considering the maintenance complete.
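
A post-maintenance check can resolve the effective value from the settings response. The sketch below assumes the nested JSON returned by GET /_cluster/settings?include_defaults=true and the usual precedence of transient over persistent over defaults; `effective_allocation_enable` is a hypothetical helper and the sample response is invented.

```python
_PATH = ("cluster", "routing", "allocation", "enable")


def effective_allocation_enable(settings: dict) -> str:
    """Return the allocation.enable value that wins, falling back to "all"."""
    for scope in ("transient", "persistent", "defaults"):
        node = settings.get(scope, {})
        for key in _PATH:
            node = node.get(key) if isinstance(node, dict) else None
            if node is None:
                break  # this scope does not set the value
        if node is not None:
            return node
    return "all"


# Invented response: a persistent override left behind after a rolling restart.
after_maintenance = {
    "persistent": {"cluster": {"routing": {"allocation": {"enable": "primaries"}}}},
    "transient": {},
    "defaults": {"cluster": {"routing": {"allocation": {"enable": "all"}}}},
}
```

Here the helper returns `primaries`, which means replicas will stay unassigned until the setting is reset to `all` or removed.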

How Netdata helps

  • Correlates cluster health status with per-node disk usage to distinguish watermark blockers from transient restarts.
  • Tracks unassigned shard count and node count over time, making it easy to spot when yellow outlasts the expected recovery window.
  • Surfaces per-node disk percentages so you can identify the blocking node before the cluster hits flood stage.
  • Monitors JVM heap and GC on data nodes during recovery storms to detect when rebalancing pushes nodes toward heap pressure.
  • Alerts on sustained yellow cluster status with configurable duration thresholds, avoiding noise from brief rolling-restart windows.