Elasticsearch slow search after restart: cold OS page cache and warmup

Restart an Elasticsearch node and searches that normally return in tens of milliseconds now take seconds. CPU stays low, disk read throughput spikes, and iowait climbs. This is not a failing disk, a runaway query, or a JVM heap problem. It is a cold OS page cache.

Elasticsearch relies on the OS filesystem cache to serve search requests from Lucene segment files. After a host reboot, that cache is empty, and the kernel must read segments from disk into memory on demand. A restart of only the Elasticsearch process usually leaves previously cached pages in place, unless memory pressure evicted them in the meantime. Until the working set is resident, queries incur disk I/O that should have been cache hits. Depending on the ratio of dataset size to available RAM, warmup can last minutes to hours.

Elasticsearch 7.0 introduced a second mechanism that produces nearly identical symptoms with a different fix. Shards that receive no search or GET request for index.search.idle.after (default 30s) enter an idle state. The next query triggers a synchronous refresh before executing, adding latency that can be mistaken for cache coldness. Telling the two apart determines whether you wait, tune a setting, or query differently.

What this means

Elasticsearch keeps index data out of the JVM heap. The heap holds segment metadata, query structures, and caches; the OS page cache holds the actual inverted indices and stored fields. This design makes the page cache the single most important resource for search performance.

After a restart, the page cache is empty. The first queries to each shard trigger a disk read for every segment page they touch, and many of those reads are random rather than sequential. Because the query itself is not CPU-intensive, the node sits with low CPU and high disk wait until the kernel caches the relevant segment files. Latency can be 10-100x higher than normal during this window. Search is scatter-gather across target shards, so the slowest cold shard sets overall latency.

The idle-shard refresh behavior is separate and per-shard. When a shard goes idle, Elasticsearch stops background refreshes to save indexing overhead. This applies only when index.refresh_interval has not been set explicitly on the index. The next search must wait for a refresh to complete. This does not require a restart, but it is often first noticed after a restart when traffic resumes unevenly across shards.

flowchart TD
    A[High search latency after restart] --> B{Low CPU and high disk reads?}
    B -->|Yes| C[Cold OS page cache]
    B -->|No| D{Slow log shows refresh wait?}
    D -->|Yes| E[Idle shard refresh block]
    D -->|No| F[Investigate queries segments heap]

Common causes

| Cause | What it looks like | First thing to check |
| --- | --- | --- |
| Cold OS page cache after restart | Query and fetch latency elevated, low CPU, high disk I/O, no slow log entries | `free -w` or `/proc/meminfo` for cached memory, plus `_cat/allocation` for segment store size |
| Idle-shard refresh blocking (Elasticsearch 7.0+) | First search after idle period is slow; subsequent queries on the same shard are fast | `GET /<index>/_settings/index.search.idle.after` and slow logs |
| Excessive Linux readahead thrashing the cache | Sustained high disk throughput but poor cache residency, common on LVM or RAID | `lsblk` or `blockdev` readahead values |
| Insufficient RAM for the page cache | Chronic high latency even after warmup; node memory is too small for the working set | Compare total index size on node to available system memory after heap |

Quick checks

# OS page cache and memory layout
free -w

# Elasticsearch node-level OS memory stats
curl -s 'http://localhost:9200/_nodes/stats/os?filter_path=nodes.*.os.mem'

# Index settings that affect warmup and idle refresh
# (include_defaults=true also returns values that were never set explicitly)
curl -s 'http://localhost:9200/<index>/_settings?include_defaults=true&filter_path=**.index.store.preload,**.index.search.idle.after'

# Slow queries with refresh waits (path and format depend on log4j2 configuration)
grep "took" /var/log/elasticsearch/*_slowlog.log | tail -20

# Linux readahead settings for block devices
lsblk -o NAME,RA,MOUNTPOINT,TYPE,SIZE

# Segment store size per node to estimate working set
curl -s 'http://localhost:9200/_cat/allocation?v&h=node,disk.indices,disk.used,disk.total'

How to diagnose it

1. Confirm the signature. High search latency with low CPU and elevated disk reads points to I/O-bound cold data. If CPU is high, look elsewhere: expensive queries, merge storms, or segment explosion.
2. Check the slow log. Cold page cache misses do not appear in the slow log because the query plan and execution are fast; only the disk read is slow. If the slow log shows entries with refresh wait times, the cause is idle-shard synchronous refresh rather than cold cache.
3. Verify page cache headroom. On the node OS, compare cached memory (free -w or /proc/meminfo) to the total size of segment files the node holds. If the dataset is larger than RAM, the cache will churn. If the dataset fits but cached memory is low, another process may be consuming memory, or the heap may be oversized.
4. Inspect index.store.preload. If the index is configured to preload specific file extensions at open time, verify the list is narrow. Preloading too many files can evict hot data and degrade search performance when the cache cannot hold the preload plus the working set.
5. Inspect index.search.idle.after. The default is 30s. If your workload has natural idle periods and latency spikes consistently on the first query after each idle window, this setting is the culprit.
6. On Linux, verify block device readahead. Values much larger than 128 KiB can cause the kernel to read more data than necessary, polluting the page cache and delaying residency of the actual Lucene segments.
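
The page cache headroom check above can be roughed out in shell. This is a minimal sketch, not an Elasticsearch feature: `meminfo_kib` and `cache_coverage_pct` are hypothetical helper names, and the result is an upper bound because it assumes the whole page cache holds segment files.

```shell
# meminfo_kib KEY [FILE]
# Prints the value (in KiB) of KEY from a meminfo-format file.
# Defaults to /proc/meminfo when FILE is omitted.
meminfo_kib() {
  awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

# cache_coverage_pct CACHED_KIB INDEX_KIB
# Prints the percentage of the index data that could be resident
# if the entire page cache held segment files (capped at 100).
cache_coverage_pct() {
  awk -v c="$1" -v i="$2" 'BEGIN {
    if (i <= 0) { print 0; exit }
    p = 100 * c / i
    if (p > 100) p = 100
    printf "%d\n", p
  }'
}
```

A possible invocation, assuming the data directory is `/var/lib/elasticsearch` (adjust to your `path.data`): `cache_coverage_pct "$(meminfo_kib Cached)" "$(du -sk /var/lib/elasticsearch | cut -f1)"`. A low percentage long after a restart suggests the working set may not fit in RAM.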

Metrics and signals to monitor

| Signal | Why it matters | Warning sign |
| --- | --- | --- |
| Search latency (query and fetch phases) | Primary user-facing symptom | Sustained >5x baseline after restart |
| OS CPU percent | Distinguishes CPU-bound from I/O-bound search | Low CPU despite high latency |
| Disk I/O wait and read throughput | Confirms disk-bound reads from cold cache | `iowait` >30% or sustained high read throughput |
| OS page cache / available memory | Measures cache adequacy for segment files; Elasticsearch does not expose page cache directly | Page cache plus free memory < working set size, or chronic swapping |
| `index.search.idle.after` | Idle-shard penalty affects first search after idle | Latency spikes correlated with idle periods >30s |
| Thread pool search queue | Queuing indicates pressure from slow shards | Queue depth growing while latency is high |
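
The search queue signal in the last row can be sampled with the `_cat/thread_pool` API. The sketch below is one way to reduce it to a single number; `max_search_queue` is a hypothetical helper name and the base URL is an assumption.

```shell
# max_search_queue BASE_URL
# Prints the largest search thread pool queue depth across all nodes.
# _cat/thread_pool/search?h=queue returns one queue value per node.
max_search_queue() {
  curl -s "$1/_cat/thread_pool/search?h=queue" | sort -n | tail -n 1 | tr -d ' '
}
```

For example, `max_search_queue http://localhost:9200` run every few seconds during warmup shows whether queued searches are draining or piling up behind cold shards.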

Fixes

Cold page cache

Wait. If the cluster is green and the node is healthy, the cache will warm naturally as queries run. This is not a bug and does not require a restart or configuration change.

Run warm-up queries. Before returning a restarted node to the load balancer, issue representative searches against critical indices. Use the same query patterns your application uses so the relevant segment files are loaded into cache. This shifts the latency impact from users to the maintenance window. For large datasets, warmup can still take hours, so plan maintenance windows accordingly.
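
A warm-up pass can be as simple as replaying saved search bodies. The following is a sketch under stated assumptions: `warm_index` is a hypothetical helper, and the query file is assumed to hold one JSON search body per line, captured from your own application.

```shell
# warm_index BASE_URL INDEX QUERY_FILE
# Replays each non-empty line of QUERY_FILE as a search body against INDEX
# and prints the total request time for each one. Responses are discarded.
warm_index() {
  base_url=$1
  index=$2
  query_file=$3
  while IFS= read -r body; do
    [ -n "$body" ] || continue
    curl -s -o /dev/null -w '%{time_total}s\n' \
      -H 'Content-Type: application/json' \
      -X POST "$base_url/$index/_search" \
      -d "$body"
  done < "$query_file"
}
```

For example, `warm_index http://localhost:9200 my-index warmup-queries.ndjson` (both names are placeholders). Repeating the pass until the printed times settle near your normal baseline is a reasonable signal that the relevant segments are resident.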

Use index.store.preload with caution. You can configure an index to eagerly load specific Lucene file types into the filesystem cache when the index is opened, for example ["cfs", "dvm", "tim"]. However, preloading too many files on too many indices will evict hot data and make search slower if the cache cannot hold everything. Reserve this for small, critical indices.
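
Because index.store.preload is a static setting, it has to be supplied when the index is created (or set on a closed index). A minimal sketch of creating an index with the extension list from the example above; `create_index_with_preload` and the index name are hypothetical.

```shell
# create_index_with_preload BASE_URL INDEX
# Creates INDEX with a narrow preload list. The extensions are the ones
# used as an example in this guide; choose them for your own workload.
create_index_with_preload() {
  curl -s -X PUT "$1/$2" \
    -H 'Content-Type: application/json' \
    -d '{"settings": {"index.store.preload": ["cfs", "dvm", "tim"]}}'
}
```

For example, `create_index_with_preload http://localhost:9200 my-critical-index`.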

Ensure adequate RAM. Elasticsearch recommends leaving at least half of available system memory for the filesystem cache. If the JVM heap consumes too much of the node RAM, the remaining space for the page cache shrinks and warmup becomes ineffective. Keep heap sized appropriately (typically no more than 26-30 GB).
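
The two sizing rules above (leave at least half of RAM to the filesystem cache, and cap the heap) reduce to a small calculation. This is only a sketch: `suggest_heap_gib` is a hypothetical helper, and the default cap of 26 GiB is an assumption taken from the lower end of the range quoted here.

```shell
# suggest_heap_gib TOTAL_RAM_GIB [CAP_GIB]
# Prints a heap size in GiB: half of total RAM, but never more than the cap.
suggest_heap_gib() {
  total=$1
  cap=${2:-26}
  half=$(( total / 2 ))
  if [ "$half" -gt "$cap" ]; then
    echo "$cap"
  else
    echo "$half"
  fi
}
```

On a 64 GiB node this prints 26, which leaves roughly 38 GiB for the page cache and other processes; on a 16 GiB node it prints 8.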

Idle-shard refresh blocking

Tune index.search.idle.after. The default 30s is appropriate for continuously searched indices. For batch or intermittently searched workloads, raising this value (for example, to 3600s) keeps background refreshes active longer, preventing the synchronous refresh penalty on the next query. The tradeoff is slightly higher indexing overhead. Do not raise this on always-on search paths unless you understand the indexing cost.
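
The setting is dynamic, so it can be changed on a live index through the update settings API. A minimal sketch, where `set_search_idle_after` and the index name are hypothetical:

```shell
# set_search_idle_after BASE_URL INDEX VALUE
# Updates index.search.idle.after on INDEX, for example to "3600s".
set_search_idle_after() {
  curl -s -X PUT "$1/$2/_settings" \
    -H 'Content-Type: application/json' \
    -d "{\"index.search.idle.after\": \"$3\"}"
}
```

For example, `set_search_idle_after http://localhost:9200 my-batch-index 3600s`. Re-running the settings check from the Quick checks section confirms the new value.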

Keep traffic continuous. Sending a lightweight periodic search or GET request to critical shards prevents them from entering the idle state. This avoids the synchronous refresh penalty entirely.

Linux readahead thrashing

Set readahead to 128 KiB. High readahead values, common on LVM, software RAID, or dm-crypt devices, cause the kernel to read more data than necessary. This pollutes the page cache and delays the loading of the actual Lucene segments.

# Check current readahead
blockdev --getra /dev/<device>

# Set to 128 KiB (256 sectors)
blockdev --setra 256 /dev/<device>

This change takes effect immediately but is lost on reboot. Persist it via udev rules or your init system.
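
`blockdev` expresses readahead in 512-byte sectors, which is where the 256 above comes from. The conversion can be sketched as follows; the helper names are hypothetical.

```shell
# kib_to_sectors KIB
# Converts a readahead size in KiB to 512-byte sectors for blockdev --setra.
kib_to_sectors() {
  echo $(( $1 * 1024 / 512 ))
}

# sectors_to_kib SECTORS
# Converts a blockdev --getra value (512-byte sectors) back to KiB.
sectors_to_kib() {
  echo $(( $1 * 512 / 1024 ))
}
```

For example, `blockdev --setra "$(kib_to_sectors 128)" /dev/<device>` is equivalent to the command above, and `sectors_to_kib "$(blockdev --getra /dev/<device>)"` reports the current value in KiB.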

Prevention

Warm-up procedure. After any restart, run a scripted warm-up pass against critical indices before returning the node to the load balancer. This moves the latency cost from production traffic into the maintenance window.

Right-size the heap. Keep the Elasticsearch JVM heap sized appropriately (typically no more than 26-30 GB) so the OS has sufficient remaining RAM for the page cache. An oversized heap starves the cache and makes cold-start recovery longer and less stable.

Monitor page cache headroom. Track OS-level cached memory and available memory alongside the total size of index data on the node. Elasticsearch does not report page cache usage directly. If the gap shrinks over time, you are approaching chronic cache pressure that will amplify any restart.

Avoid aggressive readahead. Configure Linux block devices with a 128 KiB readahead, especially under LVM or RAID. This prevents cache pollution during warmup and steady-state operation.

Tune idle-shard behavior for your workload. If your traffic pattern is naturally bursty and you observe refresh-related latency spikes, adjust index.search.idle.after proactively rather than reacting to user complaints.

How Netdata helps

- Correlate Elasticsearch search latency with per-disk iowait and read throughput to confirm a cold-cache bottleneck rather than CPU or heap pressure.
- Track system RAM and cached memory to verify page cache headroom after restarts.
- Alert on search latency spikes paired with low CPU, which points to cold page cache or idle-shard I/O waits.
- Visualize per-node disk I/O to distinguish warmup reads from merge or recovery traffic.
