Cassandra Too many open files: file descriptor exhaustion

Your application starts logging connection timeouts or Unable to connect errors, yet nodetool status still marks the node as UP. Inside the Cassandra system log, you see java.io.IOException: Too many open files or compaction tasks failing with Cannot open enough files. This is file descriptor exhaustion. The node operates normally until it hits the process ulimit, then it cannot open new SSTables, accept client sockets, or maintain internode connections. The result is a node that looks alive in gossip but is functionally unable to serve reads, writes, or background compaction.

What this means

Linux enforces a per-process open file descriptor limit. Each SSTable opens approximately six file handles for its component files (Data.db, Index.db, Filter.db, Summary.db, Statistics.db, CompressionInfo.db). Every client and internode connection consumes an additional descriptor. On a node with thousands of SSTables and hundreds of active connections, FDs accumulate rapidly.
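
The arithmetic can be sketched as a small shell helper. This is a rough estimate only: the factor of six is the approximation used above, and the node sizes in the example are hypothetical.

```shell
# Rough FD estimate: SSTables x components per SSTable + connections.
# The default of 6 per SSTable is the approximation from the text, not a measured value.
estimate_fds() {
    est_sstables=$1
    est_connections=$2
    est_per_sstable=${3:-6}
    echo $(( est_sstables * est_per_sstable + est_connections ))
}

# Hypothetical node: 3,000 SSTables and 400 client/internode connections
estimate_fds 3000 400    # prints 18400
```

Even this modest hypothetical node needs far more descriptors than a 1024 or 4096 limit allows.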

Most Linux distributions ship with a default soft limit of 1024. When the JVM’s OpenFileDescriptorCount reaches MaxFileDescriptorCount, the process cannot open new files or sockets. Memtable flushes and compaction stall because new SSTables cannot be created. Native transport rejects new CQL connections. Internode sockets may fail, causing gossip heartbeats to miss and peers to mark the node DOWN. Unlike CPU or heap pressure, FD exhaustion is a hard stop.

flowchart TD
    A[High SSTable count] --> B[6 FDs per SSTable]
    C[Client and internode connections] --> D[Total FD usage grows]
    B --> D
    D --> E{FD usage >= ulimit}
    E -->|Yes| F[Too many open files]
    F --> G[Cannot open new SSTables]
    F --> H[Native transport rejects connections]
    F --> I[Gossip sockets fail]
    I --> J[Node marked DOWN]

Common causes

| Cause | What it looks like | First thing to check |
|---|---|---|
| Compaction backlog | SSTable count grows over days; read latency climbs; FD usage rises steadily | nodetool compactionstats and per-table SSTable count |
| Client connection leak | FD count rises without SSTable growth; many sockets in CLOSE_WAIT or from a single client IP | ss -tanp state close-wait or lsof filtered by client IP |
| ulimit misconfiguration | Max open files at 1024 or 4096; exhaustion happens under moderate load | /proc/<pid>/limits |
| Streaming during bootstrap or repair | FD spike correlates with topology change or repair session streaming many SSTables | nodetool netstats |

Quick checks

These commands are read-only and safe to run on a live node.

# Verify the current limit and usage for the Cassandra process
# (these commands assume pgrep matches exactly one CassandraDaemon process)
grep "Max open files" /proc/$(pgrep -f CassandraDaemon)/limits
ls /proc/$(pgrep -f CassandraDaemon)/fd | wc -l

# Estimate FD consumption without lsof overhead
# Symlinks to *.db files are SSTables; symlinks to socket:[inode] are network connections
readlink /proc/$(pgrep -f CassandraDaemon)/fd/* | grep -c "\.db$"
readlink /proc/$(pgrep -f CassandraDaemon)/fd/* | grep -c "socket:"

# If you use lsof, be aware it can spike CPU on FD-heavy processes
lsof -n -p $(pgrep -f CassandraDaemon) | grep -c "\.db$"
lsof -n -p $(pgrep -f CassandraDaemon) | grep -c "TCP"

# Inspect SSTable count per table
nodetool tablestats <keyspace> | grep "SSTable count"

# Look for connection leaks via CLOSE_WAIT sockets
# (-H suppresses the header line so it is not counted)
ss -Htan state close-wait | wc -l

# JMX exposes the same counters as OpenFileDescriptorCount and MaxFileDescriptorCount
# on the JVM OperatingSystem MBean; read them with a JMX client or your monitoring agent
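
The limit and usage checks can be combined into one helper that reports headroom as a percentage. This is a minimal sketch for Linux, assuming the standard layout of /proc/<pid>/limits; the function name and the optional second argument (an alternate proc root, useful only for testing) are illustrative.

```shell
# Print "used soft_limit percent" for a PID using only /proc (Linux).
fd_usage() {
    fdu_pid=$1
    fdu_proc=${2:-/proc}
    # The fourth field of the "Max open files" row is the soft limit
    fdu_limit=$(awk '/^Max open files/ {print $4}' "$fdu_proc/$fdu_pid/limits")
    # Each entry in the fd directory is one open descriptor
    fdu_used=$(( $(ls "$fdu_proc/$fdu_pid/fd" | wc -l) ))
    echo "$fdu_used $fdu_limit $(( fdu_used * 100 / fdu_limit ))"
}

# Usage on a live node (run as root or the Cassandra user):
# fd_usage "$(pgrep -f CassandraDaemon)"
```

A third field of 80 or higher corresponds to the 0.8 ratio used as a warning threshold later in this guide.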

How to diagnose it

  1. Confirm the limit. Run cat /proc/<cassandra_pid>/limits and look for Max open files. If the hard limit is 4096 or lower, the node is misconfigured regardless of current usage.
  2. Measure current usage. Count entries in /proc/<pid>/fd/ or read OpenFileDescriptorCount from the JVM OperatingSystem MBean over JMX. If usage is within 20% of the limit, the node is in the danger zone.
  3. Determine what is consuming FDs. Use readlink /proc/<pid>/fd/* to compare .db files against network sockets without the overhead of lsof. If .db files dominate, the root cause is SSTable accumulation. If TCP sockets dominate, look for a connection leak or driver misconfiguration.
  4. Check for CLOSE_WAIT. A socket in CLOSE_WAIT means the remote peer has closed its side of the connection but the local process has not yet closed its own. A high, persistent count on the Cassandra node is a classic leak signature. Filter ss -tanp by the Cassandra PID to confirm the sockets belong to the process.
  5. Correlate with compaction. Run nodetool compactionstats. If pending tasks are high and SSTable counts are growing, compaction backlog is driving FD growth. If SSTable counts are stable but FDs are climbing, the cause is almost certainly connections.
  6. Check the timeline. Did FD usage spike during a bootstrap, repair, or application deployment? Streaming operations temporarily increase SSTable counts, and client deployments can introduce connection pool bugs. nodetool netstats shows active streams.
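
Step 3 can be sketched as a single pass over the readlink output. The helper below is an illustration, not part of any Cassandra tooling: it reads one FD target per line on stdin and reports how many are SSTable components, sockets, or anything else.

```shell
# Classify FD targets read from stdin, one readlink result per line.
classify_fds() {
    awk '
        /\.db$/     { sstable++; next }   # SSTable component files
        /^socket:/  { socket++;  next }   # client and internode sockets
                    { other++ }           # logs, jars, pipes, and so on
        END { printf "sstable=%d socket=%d other=%d\n", sstable, socket, other }
    '
}

# Usage on a live node (needs read access to the fd directory of the process):
# readlink /proc/"$(pgrep -f CassandraDaemon)"/fd/* | classify_fds
```

If the sstable figure dominates, continue with the compaction checks; if the socket figure dominates, continue with the connection checks.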

Metrics and signals to monitor

| Signal | Why it matters | Warning sign |
|---|---|---|
| OpenFileDescriptorCount / MaxFileDescriptorCount | Direct measure of FD headroom | Ratio > 0.8 sustained |
| SSTable count per table | Each SSTable opens ~6 FDs | Count growing over multiple days |
| Connected native clients | Each connection consumes one FD | Count increasing without traffic increase |
| Compaction pending tasks | Backlog prevents SSTable merging | Pending trending upward over 4+ hours |
| Node liveness (gossip) | FD exhaustion can prevent internode sockets from functioning | Node flaps or is marked DOWN while process is alive |

Fixes

Immediate relief

If the node is currently exhausted and rejecting connections, you need to buy time without making the problem worse.

  • Disable native transport to stop new connection attempts: nodetool disablebinary. This drops existing client connections and prevents further FD consumption from connection retries. Re-enable with nodetool enablebinary after the root cause is resolved. This is disruptive to clients.
  • Stop misbehaving clients. If a specific application or IP is leaking connections, block that traffic at the firewall or load balancer. Stopping the offending client process is often faster than waiting for socket timeouts.
  • Trigger targeted compaction only if disk space and IOPS allow. nodetool compact <keyspace> <table> reduces SSTable count, which frees FDs. Warning: this is I/O intensive, temporarily requires additional disk space, and should be run during low traffic.

Raise the ulimit

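A production Cassandra node should have a file descriptor limit of at least 100,000. Limits configured through systemd, limits.conf, or container flags are applied at process start, so they take effect only after the Cassandra process restarts.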

  • For systemd-managed hosts, set LimitNOFILE=100000 under the [Service] section of a drop-in such as /etc/systemd/system/cassandra.service.d/override.conf, then run systemctl daemon-reload and restart Cassandra. Verify the new limit in /proc/<pid>/limits after startup.
  • For init.d or manual launches, ensure the ulimit is raised in the startup script or via /etc/security/limits.conf for the Cassandra user. Be aware that systemd ignores limits.conf for services it manages.
  • For containerized deployments, pass --ulimit nofile=100000:100000 to Docker or Podman. Kubernetes inherits limits from the container runtime or node configuration by default.
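
For the systemd case, the drop-in can be written with a few lines of shell. This is a sketch under stated assumptions: the unit is named cassandra, the drop-in path matches the example above, and the helper name is illustrative.

```shell
# Write a systemd drop-in that raises the open-file limit.
# Arguments: drop-in directory and limit (defaults follow the text above).
write_nofile_dropin() {
    wnd_dir=${1:-/etc/systemd/system/cassandra.service.d}
    wnd_limit=${2:-100000}
    mkdir -p "$wnd_dir" || return 1
    printf '[Service]\nLimitNOFILE=%s\n' "$wnd_limit" > "$wnd_dir/override.conf"
}

# Typical sequence as root (the unit name "cassandra" is an assumption):
# write_nofile_dropin
# systemctl daemon-reload
# systemctl restart cassandra
# grep "Max open files" /proc/"$(pgrep -f CassandraDaemon)"/limits
```

The final grep confirms the limit the running process actually received, which is the value that matters.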

Reduce SSTable count

If FDs are consumed by .db files, compaction is not keeping up.

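  • Increase compaction_throughput_mb_per_sec (named compaction_throughput in Cassandra 4.1 and later) temporarily so compaction can catch up. nodetool setcompactionthroughput changes the value at runtime without a restart. Monitor disk I/O to ensure reads are not starved.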
  • If a specific table has an anomalously high SSTable count, run a targeted compaction on that table during a maintenance window.
  • Review the compaction strategy. STCS can accumulate many SSTables under heavy write load. If read amplification is chronically high, evaluate whether LCS or TWCS is more appropriate for the workload.
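
To find tables with an anomalously high SSTable count, the nodetool tablestats output can be filtered with awk. The sketch below assumes the "Keyspace", "Table:", and "SSTable count:" labels match your Cassandra version; the helper name and the default threshold of 50 are hypothetical and should be tuned to your compaction strategy.

```shell
# List tables whose SSTable count exceeds a threshold.
# Reads `nodetool tablestats` output on stdin; prints "keyspace.table count".
high_sstable_tables() {
    awk -v max="${1:-50}" '
        /^[ \t]*Keyspace ?:/     { ks = $NF }
        /^[ \t]*Table:/          { tbl = $NF }
        /^[ \t]*SSTable count:/  { if ($NF + 0 > max) print ks "." tbl, $NF }
    '
}

# Usage on a live node:
# nodetool tablestats | high_sstable_tables 100
```

Tables reported here are the candidates for a targeted compaction or a compaction strategy review.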

Fix connection leaks

If TCP sockets dominate FD usage:

  • Audit client driver connection pool limits. Many drivers expose settings such as max_connections_per_host or local/remote pool size, but the exact parameter names are driver-specific.
  • Verify that application code properly closes connections and sessions, especially in exception paths.
  • Check for client-side retry storms. A retry loop that opens a new connection per attempt will exhaust FDs rapidly. Look for rising DroppedMessages in nodetool tpstats as a side effect.
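
To identify which client is leaking, CLOSE_WAIT sockets can be grouped by peer address. This is a minimal sketch: it assumes the peer address:port is the last column of the ss output, and port 9042 in the usage line is the default native transport port, which may differ on your cluster.

```shell
# Count sockets per peer IP, busiest first.
# Reads `ss -Htan state close-wait` output on stdin.
close_wait_by_peer() {
    # Take the peer column, strip the trailing :port, then tally per address
    awk '{ print $NF }' | sed 's/:[0-9]*$//' | sort | uniq -c | sort -rn
}

# Usage on a live node, limited to the native transport port:
# ss -Htan state close-wait '( sport = :9042 )' | close_wait_by_peer | head
```

A single address with a large count points at one application instance rather than a cluster-wide driver setting.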

Prevention

  • Never deploy with default ulimits. Confirm that the Cassandra process starts with nofile >= 100000 by reading /proc/<pid>/limits after startup, not by checking ulimit in your shell environment.
  • Monitor FD usage ratio continuously. Alert when OpenFileDescriptorCount / MaxFileDescriptorCount exceeds 0.8.
  • Monitor SSTable count trends. Even with a high ulimit, unbounded SSTable growth will eventually exhaust any limit. Treat rising SSTable counts as a leading indicator.
  • Monitor client connection counts. A connection count that grows without a corresponding throughput increase usually indicates a leak.
  • Schedule compactions and repairs carefully. Background streaming can spike SSTable counts. Ensure the node has FD headroom before starting large topology changes.

How Netdata helps

Netdata collects OpenFileDescriptorCount and MaxFileDescriptorCount from the JVM OperatingSystem MBean and surfaces FD usage as a percentage.

Correlate FD saturation with SSTable count, compaction pending tasks, and thread pool blocked tasks on the same dashboard to distinguish compaction backlog from connection leaks.

Track client request latency and dropped message rates alongside FD metrics to detect the failure pattern before the node stops accepting connections.

Alert on FD usage percentage and on connection counts that grow while request throughput stays flat.