Cassandra Too many open files: file descriptor exhaustion
Your application starts logging connection timeouts or Unable to connect errors, yet nodetool status still marks the node as UP. Inside the Cassandra system log, you see java.io.IOException: Too many open files or compaction tasks failing with Cannot open enough files. This is file descriptor exhaustion. The node operates normally until it hits the process ulimit, then it cannot open new SSTables, accept client sockets, or maintain internode connections. The result is a node that looks alive in gossip but is functionally unable to serve reads, writes, or background compaction.
What this means
Linux enforces a per-process open file descriptor limit. Each SSTable opens approximately six file handles for its component files (Data.db, Index.db, Filter.db, Summary.db, Statistics.db, CompressionInfo.db). Every client and internode connection consumes an additional descriptor. On a node with thousands of SSTables and hundreds of active connections, FDs accumulate rapidly.
Most Linux distributions ship with a default soft limit of 1024. When the JVM’s OpenFileDescriptorCount reaches MaxFileDescriptorCount, the process cannot open new files or sockets. Memtable flushes and compaction stall because new SSTables cannot be created. Native transport rejects new CQL connections. Internode sockets may fail, causing gossip heartbeats to miss and peers to mark the node DOWN. Unlike CPU or heap pressure, FD exhaustion is a hard stop.
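The arithmetic behind this can be sketched in a few lines of shell. This is a rough estimate only: the function name `estimate_fd_usage` is hypothetical, it uses the approximate six descriptors per SSTable described above, and it ignores descriptors the JVM holds for jars, logs, and commit log segments.

```shell
# Rough FD estimate: ~6 descriptors per SSTable plus one per connection.
# Usage: estimate_fd_usage <sstable_count> <connection_count> <fd_limit>
estimate_fd_usage() {
  sstables="$1"; connections="$2"; limit="$3"
  fds_per_sstable=6   # approximation taken from the text, not a measured value
  estimated=$(( sstables * fds_per_sstable + connections ))
  percent=$(( estimated * 100 / limit ))   # integer division, rounds down
  echo "estimated=${estimated} limit=${limit} percent_of_limit=${percent}"
}

estimate_fd_usage 150 200 1024      # prints estimated=1100 limit=1024 percent_of_limit=107
estimate_fd_usage 3000 500 100000   # prints estimated=18500 limit=100000 percent_of_limit=18
```

With these illustrative numbers, even 150 SSTables and 200 connections exceed a 1024 limit, while 3,000 SSTables fit comfortably under 100,000.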
flowchart TD
A[High SSTable count] --> B[6 FDs per SSTable]
C[Client and internode connections] --> D[Total FD usage grows]
B --> D
D --> E{FD usage >= ulimit}
E -->|Yes| F[Too many open files]
F --> G[Cannot open new SSTables]
F --> H[Native transport rejects connections]
F --> I[Gossip sockets fail]
I --> J[Node marked DOWN]
Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| Compaction backlog | SSTable count grows over days; read latency climbs; FD usage rises steadily | nodetool compactionstats and per-table SSTable count |
| Client connection leak | FD count rises without SSTable growth; many sockets in CLOSE_WAIT or from a single client IP | ss -tanp state close-wait or lsof filtered by client IP |
| ulimit misconfiguration | Max open files at 1024 or 4096; exhaustion happens under moderate load | /proc/<pid>/limits |
| Streaming during bootstrap or repair | FD spike correlates with topology change or repair session streaming many SSTables | nodetool netstats |
Quick checks
These commands are read-only and safe to run on a live node.
# Verify the current limit and usage for the Cassandra process
cat /proc/$(pgrep -f CassandraDaemon)/limits | grep "Max open files"
ls /proc/$(pgrep -f CassandraDaemon)/fd | wc -l
# Estimate FD consumption without lsof overhead
# Symlinks to *.db files are SSTables; symlinks to socket:[inode] are network connections
readlink /proc/$(pgrep -f CassandraDaemon)/fd/* | grep -c "\.db$"
readlink /proc/$(pgrep -f CassandraDaemon)/fd/* | grep -c "socket:"
# If you use lsof, be aware it can spike CPU on FD-heavy processes
lsof -n -p $(pgrep -f CassandraDaemon) | grep -c "\.db$"
lsof -n -p $(pgrep -f CassandraDaemon) | grep -c "TCP"
# Inspect SSTable count per table
nodetool tablestats <keyspace> | grep "SSTable count"
# Look for connection leaks via CLOSE_WAIT sockets (tail skips the ss header line)
ss -tan state close-wait | tail -n +2 | wc -l
# Check JMX FD counters directly
nodetool info | grep "Open File Descriptors"
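The first two checks can be combined into one helper that reports usage against the limit. This is a minimal sketch, assuming a Linux `/proc` filesystem and permission to read the target process's `fd` directory (same user or root); the function name `fd_usage` is hypothetical.

```shell
# Print used and maximum FDs for one PID, plus the usage percentage.
# Usage: fd_usage <pid>
fd_usage() {
  pid="$1"
  limits_file="/proc/${pid}/limits"
  [ -r "$limits_file" ] || { echo "cannot read ${limits_file}" >&2; return 1; }
  # The soft limit is the 4th whitespace-separated field of the "Max open files" row.
  max=$(awk '/^Max open files/ { print $4 }' "$limits_file")
  case "$max" in
    ''|*[!0-9]*) echo "unexpected limit value: ${max}" >&2; return 1 ;;
  esac
  used=$(ls "/proc/${pid}/fd" | wc -l)
  used=$(( used + 0 ))   # normalize any whitespace from wc
  echo "pid=${pid} used=${used} max=${max} percent=$(( used * 100 / max ))"
}
```

For example, `fd_usage "$(pgrep -f CassandraDaemon)"` would report the Cassandra process, assuming `pgrep` matches exactly one PID.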
How to diagnose it
- Confirm the limit. Run `cat /proc/<cassandra_pid>/limits` and look for `Max open files`. If the hard limit is 4096 or lower, the node is misconfigured regardless of current usage.
- Measure current usage. Count entries in `/proc/<pid>/fd/` or read `OpenFileDescriptorCount` from JMX via `nodetool info`. If usage is within 20% of the limit, the node is in the danger zone.
- Determine what is consuming FDs. Use `readlink /proc/<pid>/fd/*` to compare `.db` files against network sockets without the overhead of `lsof`. If `.db` files dominate, the root cause is SSTable accumulation. If TCP sockets dominate, look for a connection leak or driver misconfiguration.
- Check for CLOSE_WAIT. A high count of `CLOSE_WAIT` sockets indicates that the JVM is not releasing closed connections, which is a classic leak signature. Filter `ss -tanp` by the Cassandra PID to confirm the sockets belong to the process.
- Correlate with compaction. Run `nodetool compactionstats`. If pending tasks are high and SSTable counts are growing, compaction backlog is driving FD growth. If SSTable counts are stable but FDs are climbing, the cause is almost certainly connections.
- Check the timeline. Did FD usage spike during a bootstrap, repair, or application deployment? Streaming operations temporarily increase SSTable counts, and client deployments can introduce connection pool bugs. `nodetool netstats` shows active streams.
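The "what is consuming FDs" step can be done in a single pass instead of several `grep -c` runs. This is a sketch under the same assumptions as the steps above: SSTable component files end in `.db` and sockets appear as `socket:[inode]`. The function name `classify_fd_targets` is hypothetical.

```shell
# Reads FD targets (one per line, as printed by readlink) on stdin and
# prints how many are SSTable component files, sockets, or anything else.
classify_fd_targets() {
  awk '
    /\.db$/    { sstable++; next }   # SSTable component files
    /^socket:/ { socket++;  next }   # client and internode connections
               { other++ }           # jars, logs, pipes, and so on
    END { printf "sstable_files=%d sockets=%d other=%d\n", sstable, socket, other }
  '
}
```

Typical use would be `readlink /proc/<pid>/fd/* | classify_fd_targets`. Whichever of the first two counters dominates points to either SSTable accumulation or a connection problem.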
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
| OpenFileDescriptorCount / MaxFileDescriptorCount | Direct measure of FD headroom | Ratio > 0.8 sustained |
| SSTable count per table | Each SSTable opens ~6 FDs | Count growing over multiple days |
| Connected native clients | Each connection consumes one FD | Count increasing without traffic increase |
| Compaction pending tasks | Backlog prevents SSTable merging | Pending trending upward over 4+ hours |
| Node liveness (gossip) | FD exhaustion can prevent internode sockets from functioning | Node flaps or is marked DOWN while process is alive |
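
The FD ratio warning in the table can be expressed as a small check suitable for a cron job or a monitoring hook. This is a sketch only: the function name `fd_headroom_check` is hypothetical, the 80% default mirrors the 0.8 ratio above, and the "sustained" part is left to whatever system calls it repeatedly.

```shell
# Compare used FDs against the limit; return 1 when usage exceeds the threshold.
# Usage: fd_headroom_check <used> <max> [threshold_percent]
fd_headroom_check() {
  used="$1"; max="$2"; threshold_pct="${3:-80}"
  pct=$(( used * 100 / max ))   # integer division, so 80.9% is reported as 80
  if [ "$pct" -gt "$threshold_pct" ]; then
    echo "WARNING fd_usage=${pct}% (threshold ${threshold_pct}%)"
    return 1
  fi
  echo "OK fd_usage=${pct}%"
}

fd_headroom_check 500 1000   # prints OK fd_usage=50%
```

The two inputs correspond to `OpenFileDescriptorCount` and `MaxFileDescriptorCount`, however they are collected.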
Fixes
Immediate relief
If the node is currently exhausted and rejecting connections, you need to buy time without making the problem worse.
- Disable native transport to stop new connection attempts: `nodetool disablebinary`. This drops existing client connections and prevents further FD consumption from connection retries. Re-enable with `nodetool enablebinary` after the root cause is resolved. This is disruptive to clients.
- Stop misbehaving clients. If a specific application or IP is leaking connections, block that traffic at the firewall or load balancer. Dropping the client-side process is often faster than waiting for socket timeouts.
- Trigger targeted compaction only if disk space and IOPS allow. `nodetool compact <keyspace> <table>` reduces SSTable count, which frees FDs. Warning: this is I/O intensive, temporarily requires additional disk space, and should be run during low traffic.
Raise the ulimit
A production Cassandra node should have a file descriptor limit of at least 100,000. Changes require restarting the Cassandra process.
- For systemd-managed hosts, set `LimitNOFILE=100000` in a service drop-in such as `/etc/systemd/system/cassandra.service.d/override.conf`, then run `systemctl daemon-reload` and restart Cassandra. Verify the new limit in `/proc/<pid>/limits` after startup.
- For init.d or manual launches, ensure the ulimit is raised in the startup script or via `/etc/security/limits.conf` for the Cassandra user. Be aware that systemd ignores `limits.conf` for services it manages.
- For containerized deployments, pass `--ulimit nofile=100000:100000` to Docker or Podman. Kubernetes inherits limits from the container runtime or node configuration by default.
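
For the systemd case, the drop-in itself is only two lines. The helper below is a sketch that prints it, so the content can be reviewed before it is written anywhere; the function name `render_nofile_dropin` is hypothetical, and the unit name `cassandra.service` is assumed from the path above and may differ per package.

```shell
# Print a systemd drop-in that raises the open-file limit for the service.
# Usage: render_nofile_dropin [limit]
render_nofile_dropin() {
  limit="${1:-100000}"   # default follows the 100,000 recommendation in this section
  printf '[Service]\nLimitNOFILE=%s\n' "$limit"
}
```

One way to apply it is `render_nofile_dropin | sudo tee /etc/systemd/system/cassandra.service.d/override.conf` (the directory must already exist), followed by `systemctl daemon-reload` and a Cassandra restart.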
Reduce SSTable count
If FDs are consumed by .db files, compaction is not keeping up.
- Increase `compaction_throughput_mb_per_sec` temporarily so compaction can catch up (at runtime, `nodetool setcompactionthroughput` changes the value without a restart). Monitor disk I/O to ensure reads are not starved.
- If a specific table has an anomalously high SSTable count, run a targeted compaction on that table during a maintenance window.
- Review the compaction strategy. STCS can accumulate many SSTables under heavy write load. If read amplification is chronically high, evaluate whether LCS or TWCS is more appropriate for the workload.
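
To find the tables worth compacting, the `nodetool tablestats` output can be filtered by SSTable count. This is a sketch that assumes the common text layout with `Keyspace : <name>`, `Table: <name>`, and `SSTable count: <n>` lines; the layout varies between Cassandra versions, so check it against your own output first. The function name and the default threshold of 100 are hypothetical.

```shell
# Reads `nodetool tablestats` text on stdin and prints "count keyspace.table"
# for tables whose SSTable count exceeds the threshold, highest first.
# Usage: nodetool tablestats | tables_over_sstable_threshold [threshold]
tables_over_sstable_threshold() {
  threshold="${1:-100}"
  awk -v threshold="$threshold" '
    $1 ~ /^Keyspace:?$/               { ks = $NF; next }
    $1 == "Table:"                    { tbl = $2; next }
    $1 == "SSTable" && $2 == "count:" {
      if ($3 + 0 > threshold + 0) print $3, ks "." tbl
    }
  ' | sort -rn
}
```

Tables near the top of the output are the first candidates for a targeted compaction or a compaction strategy review.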
Fix connection leaks
If TCP sockets dominate FD usage:
- Audit client driver connection pool limits. Many drivers expose settings such as `max_connections_per_host` or local/remote pool size, but the exact parameter names are driver-specific.
- Verify that application code properly closes connections and sessions, especially in exception paths.
- Check for client-side retry storms. A retry loop that opens a new connection per attempt will exhaust FDs rapidly. Look for rising `DroppedMessages` in `nodetool tpstats` as a side effect.
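
To see which client is holding the most sockets, group the `ss` output by peer address. This is a sketch that assumes unfiltered `ss -tan` output (with the leading State column) and the default native transport port 9042; the function name `peers_by_connection_count` is hypothetical.

```shell
# Reads `ss -tan` output on stdin and prints "count peer_ip" for sockets
# whose local port matches the given port, busiest peers first.
# Usage: ss -tan | peers_by_connection_count [port]
peers_by_connection_count() {
  port="${1:-9042}"   # 9042 is the default CQL native transport port
  awk -v port="$port" '
    $1 == "State" || $1 == "LISTEN" { next }   # skip header and listening sockets
    {
      local_addr = $4; peer = $5
      if (local_addr !~ (":" port "$")) next   # keep only the chosen local port
      sub(/:[0-9]+$/, "", peer)                # drop the peer port
      count[peer]++
    }
    END { for (p in count) print count[p], p }
  ' | sort -rn
}
```

A single address with a count far above the driver's configured pool size is a strong hint of a leak or a retry storm on that client.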
Prevention
- Never deploy with default ulimits. Verify that the Cassandra process starts with `nofile >= 100000`, and confirm the value in `/proc/<pid>/limits` after startup, not just in the shell environment.
- Monitor FD usage ratio continuously. Alert when `OpenFileDescriptorCount / MaxFileDescriptorCount` exceeds 0.8.
- Monitor SSTable count trends. Even with a high ulimit, unbounded SSTable growth will eventually exhaust any limit. Treat rising SSTable counts as a leading indicator.
- Monitor client connection counts. A connection count that grows without a corresponding throughput increase is a leak.
- Schedule compactions and repairs carefully. Background streaming can spike SSTable counts. Ensure the node has FD headroom before starting large topology changes.
How Netdata helps
Netdata collects OpenFileDescriptorCount and MaxFileDescriptorCount from the JVM OperatingSystem MBean and surfaces FD usage as a percentage.
Correlate FD saturation with SSTable count, compaction pending tasks, and thread pool blocked tasks on the same dashboard to distinguish compaction backlog from connection leaks.
Track client request latency and dropped message rates alongside FD metrics to detect the failure pattern before the node stops accepting connections.
Alert on FD usage percentage and on connection counts that grow while request throughput stays flat.
Related guides
- Cassandra adding and removing nodes safely: vnodes, tokens, and cleanup
- Cassandra node stuck in joining (UJ): bootstrap diagnosis
- Cassandra compaction strategies: STCS vs LCS vs TWCS vs UCS
- Cassandra clock skew: how NTP drift silently corrupts data
- Cassandra commitlog disk full: segment exhaustion and forced flushes
- Cassandra commitlog pending tasks: write-path I/O pressure
- Cassandra compaction death spiral: when writes outrun compaction throughput
- Cassandra consistency levels explained: QUORUM, ONE, LOCAL_QUORUM, and EACH_QUORUM
- Cassandra zombie data resurrection: gc_grace_seconds and unrepaired tombstones
- Cassandra disk space exhaustion: emergency recovery when the data volume fills
- Cassandra dropped mutations: silent write loss and load shedding
- Cassandra dropped reads and other messages: reading nodetool tpstats Dropped







