Cassandra heap pressure: sizing the JVM heap and tuning G1GC
Cassandra runs as a single JVM process per node. Every write-path allocation, memtable mutation, read merge buffer, and cache entry lives on the heap. When the heap is undersized or GC is left at JVM defaults, stop-the-world pauses freeze gossip, client requests, and compaction. A pause longer than roughly 18 seconds (the point at which the failure detector's phi value crosses the default phi_convict_threshold of 8) causes peers to mark the node DOWN, which triggers hinted handoff, replay storms, and client retries that worsen memory pressure: the GC death spiral.
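The roughly 18-second figure can be reproduced with a back-of-the-envelope calculation. The sketch below assumes the exponential approximation in which phi grows linearly with the silence since the last heartbeat, relative to the mean heartbeat interval, and is scaled by 1/ln(10) before being compared with phi_convict_threshold. The function name and the 1-second mean interval are illustrative assumptions, not values read from a cluster.

```python
import math


def seconds_until_convicted(phi_convict_threshold: float = 8.0,
                            mean_heartbeat_interval_s: float = 1.0) -> float:
    """Approximate silence, in seconds, before peers mark a node DOWN.

    Assumption: phi = elapsed / mean_interval, and the node is convicted
    once phi / ln(10) exceeds phi_convict_threshold.
    """
    return phi_convict_threshold * math.log(10) * mean_heartbeat_interval_s


if __name__ == "__main__":
    # Default threshold with heartbeats arriving about once per second.
    print(f"{seconds_until_convicted():.1f} s")
```

Under these assumptions the default threshold corresponds to about 18.4 seconds of silence. A slower observed heartbeat interval stretches that window proportionally.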
Sizing the heap requires staying under the compressed-oops ceiling, accounting for off-heap memory, and understanding which G1GC flags Cassandra overrides and why. The sections below cover sizing rules for Cassandra 4.x and 5.x, the G1GC defaults in jvm11-server.options, and when to evaluate ZGC or Shenandoah.
What heap pressure means and why it matters
Heap pressure is the gap between allocation rate and reclamation rate. Writes allocate memtable objects; reads allocate temporary merge buffers from multiple SSTables; compaction allocates temporary structures while merging files. The young generation collects short-lived objects quickly, but survivors promote to the old generation.
If the old generation fills faster than mixed or full GC cycles reclaim it, the post-GC heap floor trends upward. When the floor crosses about 75 percent of the maximum heap, full GCs become frequent and longer. At around 85 percent, the JVM enters a continuous collection loop: it pauses for seconds, reclaims almost nothing, and immediately pauses again. During these pauses the node cannot gossip, so the failure detector marks it DOWN. On recovery, hint replay and client retries drive even more allocation. Without intervention the node becomes effectively unavailable.
How Cassandra sizes the heap by default
Cassandra’s default auto-calculation for MAX_HEAP_SIZE is max(min(1/2 ram, 1024MB), min(1/4 ram, 8GB)). On a 32 GB node this yields 8 GB; on a 128 GB node it also yields 8 GB. Override this in cassandra-env.sh or the version-specific JVM options file.
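The formula above is easy to misread, so here is a minimal sketch that evaluates it for a few node sizes. The function name is hypothetical, and the arithmetic works in whole megabytes for simplicity.

```python
def default_max_heap_mb(system_ram_mb: int) -> int:
    """Evaluate max(min(1/2 ram, 1024MB), min(1/4 ram, 8GB)) in megabytes."""
    half_capped = min(system_ram_mb // 2, 1024)      # min(1/2 ram, 1024MB)
    quarter_capped = min(system_ram_mb // 4, 8192)   # min(1/4 ram, 8GB)
    return max(half_capped, quarter_capped)


if __name__ == "__main__":
    for ram_gb in (2, 8, 32, 128):
        print(ram_gb, "GB RAM ->", default_max_heap_mb(ram_gb * 1024), "MB heap")
```

The second term dominates on any node with more than 4 GB of RAM, and the 8 GB cap is reached at 32 GB of RAM, which is why larger nodes get no additional heap without an explicit override.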
Lock the heap size by setting -Xms equal to -Xmx. Heap resizing triggers stop-the-world events. The widely accepted sweet spot is 8 to 16 GB. This keeps G1GC pauses short while leaving most RAM for off-heap structures and the OS page cache. The Cassandra 4.0 hardware docs recommend a heap no smaller than 2 GB and no larger than 50 percent of system RAM.
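A quick way to confirm that -Xms and -Xmx match is to parse the active lines of the JVM options file. The following is a minimal sketch; the helper names and the sample text are hypothetical, and it only understands the plain -Xms/-Xmx forms with an optional k, m, or g suffix.

```python
import re

_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_heap_flag(options_text: str, flag: str):
    """Return the size in bytes for -Xms or -Xmx, or None if the flag is absent.

    Commented-out lines are ignored; if the flag repeats, the last one wins.
    """
    size = None
    for line in options_text.splitlines():
        line = line.strip()
        if line.startswith("#"):
            continue
        match = re.fullmatch(rf"-{flag}(\d+)([kKmMgG]?)", line)
        if match:
            size = int(match.group(1)) * _UNITS[match.group(2).lower()]
    return size


def heap_is_locked(options_text: str) -> bool:
    """True when both flags are present and resolve to the same byte count."""
    xms = parse_heap_flag(options_text, "Xms")
    xmx = parse_heap_flag(options_text, "Xmx")
    return xms is not None and xms == xmx


if __name__ == "__main__":
    sample = "# hypothetical options excerpt\n-Xms8G\n-Xmx8G\n"
    print(heap_is_locked(sample))
```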
The compressed-oops ceiling and the 31 GB rule
Below roughly 32 GB the JVM uses compressed ordinary object pointers (compressed oops), which store each object reference in 4 bytes instead of 8, so the same heap holds more objects. If you set the heap to 32 GB or more, the JVM silently disables compressed oops, every reference doubles in size, and your usable space can drop despite the larger nominal heap. The practical maximum is 31 GB.
Verify the behavior for your exact JDK build before deploying:
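# Verify compressed oops mode at target heap size
java -Xmx32767M -XX:+UnlockDiagnosticVMOptions -XX:+PrintCompressedOopsMode -version
# If your JDK does not recognize that flag, inspect the resolved flag value instead
java -Xmx32767M -XX:+PrintFlagsFinal -version | grep UseCompressedOops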
If the output shows CompressedOops is disabled, reduce the heap by 256 MB increments until it is active.
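The 32 GB figure follows from simple arithmetic: a 32-bit compressed reference can address 2^32 slots, and each slot is one object-alignment unit (8 bytes by default, controlled by -XX:ObjectAlignmentInBytes). The sketch below shows that calculation; the function name is hypothetical, and the real cutoff on a given JDK sits slightly below the theoretical value, which is why the verification step above still matters.

```python
def compressed_oops_ceiling_gib(object_alignment_bytes: int = 8) -> int:
    """Theoretical largest heap, in GiB, addressable by a 32-bit compressed reference."""
    addressable_bytes = (2 ** 32) * object_alignment_bytes
    return addressable_bytes // (2 ** 30)


if __name__ == "__main__":
    print(compressed_oops_ceiling_gib())      # default 8-byte alignment
    print(compressed_oops_ceiling_gib(16))    # 16-byte alignment trades padding for reach
```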
G1GC defaults and Cassandra-specific overrides
G1GC is the default collector for Cassandra 4.x on Java 11 and newer. The flags in Cassandra’s jvm11-server.options and common production overrides include:
-XX:+UseG1GC
-XX:+ParallelRefProcEnabled
-XX:MaxTenuringThreshold=2
-XX:G1HeapRegionSize=16m
-XX:+UnlockExperimentalVMOptions
-XX:G1NewSizePercent=50
-XX:G1RSetUpdatingPauseTimePercent=5
-XX:MaxGCPauseMillis=300
-XX:InitiatingHeapOccupancyPercent=70
Three overrides deserve special attention. G1NewSizePercent=50 floors the young generation at half the total heap. It is an experimental flag, which is why -XX:+UnlockExperimentalVMOptions precedes it. The JVM default is 5 percent, but Cassandra’s write-heavy workload generates large bursts of short-lived memtable data. Without this floor, young gen can shrink under load and force premature promotion to old gen, which triggers expensive mixed collections.
MaxGCPauseMillis=300 sets the target pause higher than the JVM default of 200 ms. Cassandra’s developers have found 300 ms to be the lowest viable setting for typical workloads; lower values push G1 to collect too aggressively, which starves throughput.
InitiatingHeapOccupancyPercent=70 starts the concurrent marking cycle when the old generation reaches 70 percent occupancy (the JVM default is 45), giving G1 enough runway to plan a mixed collection before the old gen is completely full.
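To check that a node actually runs with the flags listed above, you can diff the active lines of its options file against the expected set. This is a minimal sketch: the helper names are hypothetical, the expected list is copied from the flag list in this section, and a flag set to a different value is reported the same way as a missing one.

```python
EXPECTED_G1_FLAGS = [
    "-XX:+UseG1GC",
    "-XX:+ParallelRefProcEnabled",
    "-XX:MaxTenuringThreshold=2",
    "-XX:G1HeapRegionSize=16m",
    "-XX:+UnlockExperimentalVMOptions",
    "-XX:G1NewSizePercent=50",
    "-XX:G1RSetUpdatingPauseTimePercent=5",
    "-XX:MaxGCPauseMillis=300",
    "-XX:InitiatingHeapOccupancyPercent=70",
]


def active_flags(options_text: str) -> set:
    """Collect non-empty, non-comment lines from a JVM options file."""
    stripped = (line.strip() for line in options_text.splitlines())
    return {line for line in stripped if line and not line.startswith("#")}


def audit_g1_flags(options_text: str, expected=EXPECTED_G1_FLAGS) -> list:
    """Return the expected flags that are absent or set to a different value."""
    present = active_flags(options_text)
    return [flag for flag in expected if flag not in present]


if __name__ == "__main__":
    # Hypothetical file content where the pause target was left at 200 ms.
    sample = "\n".join(EXPECTED_G1_FLAGS).replace("MaxGCPauseMillis=300",
                                                  "MaxGCPauseMillis=200")
    print(audit_g1_flags(sample))
```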
Off-heap memory: what is not in the heap
The JVM heap is only one consumer of system RAM. Cassandra stores bloom filters (off-heap by default since 3.x), compression offset maps, row and key caches, and direct Netty buffers outside the heap. If you configure off-heap memtables via memtable_offheap_space_in_mb, that also lives outside the heap. Cassandra 4.0+ adds a chunk cache that replaces OS page cache reliance, and it too consumes native memory.
flowchart TD
A[System RAM] --> B[JVM Heap]
A --> C[Off-heap structures]
A --> D[OS page cache]
B --> E[G1 Young Gen]
B --> F[G1 Old Gen]
E --> G[Short-lived objects]
F --> H[Long-lived metadata]
H --> I[Post-GC heap floor]
I --> J{Floor > 75% max?}
J -->|Yes| K[GC death spiral]
J -->|No| L[Stable operation]
C --> M[OOM risk if RSS > RAM]
If you size the heap to 50 percent of RAM and off-heap structures consume another 20 percent, the Linux OOM killer can terminate Cassandra even when nodetool info reports a comfortable 60 percent heap usage. Budget heap to no more than roughly half of system RAM and monitor process RSS independently.
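Monitoring RSS independently can be as simple as reading VmRSS from /proc/<pid>/status on Linux and subtracting the configured maximum heap. The sketch below works on the text of that file so it can be tried without a running node; the function names and the sample values are hypothetical, and the subtraction is only a rough estimate because RSS also includes JVM metadata and thread stacks.

```python
import re


def vm_rss_mb(proc_status_text: str) -> int:
    """Extract VmRSS (reported in kB) from /proc/<pid>/status content, in MB."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", proc_status_text, re.MULTILINE)
    if match is None:
        raise ValueError("VmRSS line not found")
    return int(match.group(1)) // 1024


def off_heap_estimate_mb(rss_mb: int, max_heap_mb: int) -> int:
    """Rough off-heap footprint: resident memory beyond the configured heap."""
    return max(rss_mb - max_heap_mb, 0)


def headroom_mb(ram_mb: int, rss_mb: int) -> int:
    """RAM left for the OS and page cache; a small value signals OOM-kill risk."""
    return ram_mb - rss_mb


if __name__ == "__main__":
    # Hypothetical status excerpt for a node with 64 GB RAM and a 32 GB heap.
    status = "Name:\tjava\nVmRSS:\t44040192 kB\n"
    rss = vm_rss_mb(status)
    print(rss, off_heap_estimate_mb(rss, 32768), headroom_mb(65536, rss))
```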
When the default G1GC configuration is not enough
Post-GC heap floor trending
The most reliable leading indicator of impending trouble is the heap floor: the used heap immediately after a full or mixed old-generation collection. If this floor rises steadily over days, your allocation rate has exceeded sustainable reclamation. The fix is rarely a single knob. You may need to increase the heap (if you are below 16 GB), reduce table-level caches, tune compaction to reduce temporary allocation, or reduce application batch sizes. Track the floor from GC logs or JMX java.lang:type=GarbageCollector beans. If the floor rises by more than 5 percent per week, investigate memory pressure sources such as large partitions via nodetool toppartitions.
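Once you have a series of post-GC used-heap samples, the weekly trend can be estimated with an ordinary least-squares slope. The sketch below is one way to do that; the function name and sample data are hypothetical, and it interprets the 5 percent guideline as percentage points of maximum heap per week, which is an assumption about how the threshold is meant to be read.

```python
def floor_growth_percent_per_week(samples, max_heap_mb: float) -> float:
    """Least-squares growth of the post-GC heap floor.

    samples: iterable of (day, post_gc_used_mb) pairs.
    Returns the slope expressed as percent of max heap per 7 days.
    """
    points = list(samples)
    n = len(points)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_day = sum(day for day, _ in points) / n
    mean_used = sum(used for _, used in points) / n
    spread = sum((day - mean_day) ** 2 for day, _ in points)
    if spread == 0:
        raise ValueError("samples must span more than one point in time")
    covariance = sum((day - mean_day) * (used - mean_used) for day, used in points)
    slope_mb_per_day = covariance / spread
    return slope_mb_per_day * 7 / max_heap_mb * 100


if __name__ == "__main__":
    # Hypothetical floor readings taken a week apart on an 8000 MB heap.
    readings = [(0, 4000), (7, 4800), (14, 5600)]
    growth = floor_growth_percent_per_week(readings, 8000)
    print(f"{growth:.1f}% per week, investigate: {growth > 5}")
```

A regression over many samples is less sensitive to a single unusually deep or shallow collection than comparing only the first and last readings.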
Humongous allocations
G1 treats objects larger than 50 percent of a region size as humongous. In Cassandra’s configuration with 16 MB regions, that threshold is 8 MB, but operational guidance recommends keeping individual cell or object sizes under 4 MB to avoid humongous allocation paths that bypass normal incremental collection and can force full GCs.
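The threshold arithmetic is worth making explicit. The sketch below follows the rule as stated above (strictly larger than half a region) and also shows how many whole regions a humongous object ties up, since G1 places such objects in contiguous regions of their own. The function names are hypothetical.

```python
MIB = 1024 * 1024


def humongous_threshold_bytes(region_size_mb: int = 16) -> int:
    """Half of one G1 region, in bytes."""
    return region_size_mb * MIB // 2


def is_humongous(object_bytes: int, region_size_mb: int = 16) -> bool:
    """True when an object is larger than half a region."""
    return object_bytes > humongous_threshold_bytes(region_size_mb)


def regions_spanned(object_bytes: int, region_size_mb: int = 16) -> int:
    """Whole regions consumed by an object of this size (ceiling division)."""
    region_bytes = region_size_mb * MIB
    return -(-object_bytes // region_bytes)


if __name__ == "__main__":
    for size_mb in (4, 9, 20):
        size = size_mb * MIB
        print(size_mb, "MB:", is_humongous(size), regions_spanned(size))
```

A 9 MB value is humongous yet still occupies a full 16 MB region, so the wasted tail of each region adds to old-gen pressure.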
CMS deprecation
Cassandra 3.x defaulted to the Concurrent Mark Sweep collector. CMS was deprecated in Java 9 and removed entirely in Java 14 (JEP 363). Cassandra 4.0 and later ship G1GC as the default. Do not attempt to use CMS on any supported modern JDK.
ZGC and Shenandoah
For operators running Cassandra 5.x on Java 17 or newer, ZGC and Shenandoah are available alternatives.
- ZGC does not use compressed oops, so the 32 GB ceiling does not apply. It excels at keeping pauses under 10 ms, but benchmarks have shown it caps out at roughly 41k ops/s under heavy load compared with roughly 50k ops/s for G1GC and Shenandoah. Consider ZGC when tail-latency sensitivity outweighs raw throughput and you are willing to pay the memory overhead premium.
- Shenandoah is included in upstream OpenJDK and many vendor distributions, though some Oracle JDK builds have historically omitted it. Its overhead is not justified below 4 GB heaps, but on larger heaps it can deliver lower pause times than G1 without ZGC’s throughput penalty. Cassandra 5.x with Java 17 may require additional --add-opens flags for java.base/java.io and java.base/sun.nio.ch due to Jigsaw module encapsulation.
A reported G1GC performance regression in Cassandra 4.1.5 (September 2025) is under community investigation. If you observe elevated pause times on a recent 4.1 patch release after months of stability, check the Apache dev list before spending weeks tuning flags.
Tradeoffs and when to use what
| Configuration | Best for | Caveats |
|---|---|---|
| 8 to 16 GB heap with G1GC | Most production workloads on Cassandra 4.x and 5.x | Stay under 16 GB if possible; pauses grow with larger heaps |
| 31 GB heap with G1GC | Large datasets where caches and memtables need room | Monitor old-gen pauses carefully; past 16 GB G1 efficiency drops |
| ZGC | Latency-sensitive applications, moderate load, Java 17+ | Lower throughput under heavy load; no compressed oops |
| Shenandoah | Java 17+, low-pause requirement, builds that include Shenandoah | Not beneficial below 4 GB heaps |
| Unequal -Xms and -Xmx | Never | Heap resizing triggers stop-the-world pauses |
Signals to watch in production
| Signal | Why it matters | Warning sign |
|---|---|---|
| Post-GC heap floor | Irreducible old-gen occupancy; a rising floor means a spiral is imminent | Used heap after old GC exceeds 75 percent of max |
| GC pause duration | Stop-the-world freezes gossip and all request processing | Single pause exceeds 500 ms; sustained pauses over 2 seconds risk gossip failure |
| Humongous allocation rate | Humongous objects bypass G1 incremental collection and can trigger full GC | Non-zero humongous allocations in GC logs |
| Off-heap RSS growth | Invisible to JVM metrics; drives Linux OOM kills | Process RSS minus JVM max heap trending toward remaining RAM |
| Memtable flush rate | Forced flushes from heap pressure create small SSTables and compaction debt | Flush count diverges from write throughput |
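Two of the signals in the table, pause duration and post-GC occupancy, can be pulled straight from GC log lines. The sketch below assumes the JDK unified logging format produced by -Xlog:gc (for example "Pause Young (Normal) (G1 Evacuation Pause) 4096M->1024M(8192M) 45.120ms") and applies the 500 ms, 2 second, and 75 percent thresholds from the table. The function name, the severity labels, and the sample lines are hypothetical, and the occupancy check treats the capacity in parentheses as the maximum heap, which holds only when -Xms equals -Xmx.

```python
import re

PAUSE_RE = re.compile(
    r"Pause (?P<kind>.+?) (?P<before>\d+)M->(?P<after>\d+)M"
    r"\((?P<total>\d+)M\) (?P<ms>\d+(?:\.\d+)?)ms"
)


def classify_pause(line: str, warn_ms: float = 500.0,
                   critical_ms: float = 2000.0, floor_percent: float = 75.0):
    """Return pause details for one GC log line, or None if it is not a pause."""
    match = PAUSE_RE.search(line)
    if match is None:
        return None
    pause_ms = float(match["ms"])
    after_percent = 100.0 * int(match["after"]) / int(match["total"])
    if pause_ms > critical_ms:
        severity = "critical"
    elif pause_ms > warn_ms:
        severity = "warning"
    else:
        severity = "ok"
    # Only old-generation collections say something about the heap floor.
    collects_old_gen = "Full" in match["kind"] or "Mixed" in match["kind"]
    return {
        "kind": match["kind"],
        "pause_ms": pause_ms,
        "severity": severity,
        "floor_alert": collects_old_gen and after_percent > floor_percent,
    }


if __name__ == "__main__":
    samples = [
        "[2.345s][info][gc] GC(7) Pause Young (Normal) (G1 Evacuation Pause) 4096M->1024M(8192M) 45.120ms",
        "[9.876s][info][gc] GC(9) Pause Full (G1 Evacuation Pause) 7900M->7000M(8192M) 2513.004ms",
    ]
    for sample in samples:
        print(classify_pause(sample))
```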
How Netdata helps
- Correlate JVM heap usage, old-gen occupancy, and GC pause duration with gossip state and dropped-message rates to spot a GC death spiral before nodes are marked DOWN.
- Track process RSS independently of JVM heap to catch off-heap growth that JVM metrics miss.
- Surface post-GC heap floor trends and humongous allocation rates from GC logs or JMX GarbageCollector beans.
- Overlay client request latency, thread pool pending tasks, and compaction backlog to distinguish write-burst pressure from large-partition reads or cache bloat.
Related guides
- Cassandra consistency levels explained: QUORUM, ONE, LOCAL_QUORUM, and EACH_QUORUM
- Cassandra GC death spiral: long pauses, gossip flapping, and recovery
- Cassandra monitoring checklist: the signals every production cluster needs
- Cassandra monitoring maturity model: from survival to expert
- How Cassandra actually works in production: a mental model for operators







