PostgreSQL fsync=off: why disabling it ruins durability
You are reviewing a PostgreSQL instance that crashed during a kernel update and now fails to start with checksum or WAL errors. Or you are benchmarking write throughput and a guide suggests turning off fsync to remove the “fsync bottleneck.” Maybe you are auditing configuration drift and found fsync = off in a postgresql.conf that was copied from an old developer environment.
PostgreSQL’s default is fsync = on in all supported versions. Disabling it can make bulk loads and heavy write workloads appear dramatically faster because the database stops waiting for the operating system to flush write-ahead log (WAL) records to durable storage. The cost is that an operating system crash or power loss can leave the data directory in an unrecoverable state. WAL is the source of truth; fsync = off means the database trusts the OS page cache with that truth. When that trust is broken by a power event, the resulting corruption may not be detected until you attempt to start the server or query a specific table.
What it is and why it matters
fsync controls whether PostgreSQL forces WAL and data file modifications to durable storage using system calls such as fsync() or fdatasync(). When fsync = on, PostgreSQL issues these calls according to the wal_sync_method parameter. On Linux, wal_sync_method defaults to fdatasync.
The guarantee is simple: when a transaction commits, its WAL records survive a power loss.
When fsync = off, PostgreSQL still writes WAL records to kernel buffers, but it never forces them to disk. The database assumes the OS will eventually flush the data. That assumption holds if only the PostgreSQL process crashes, because the kernel still holds the buffered writes and will flush them. It fails for OS-level crashes. The OS page cache is a volatile staging area. If the OS panics or the host loses power before the kernel page cache is written to disk, committed transactions evaporate. Worse, because the kernel may have written data file pages before the WAL that describes them, the on-disk state can become internally inconsistent. The difference between losing a few seconds of transactions and losing the entire database is the difference between a recoverable incident and a restore-from-backup emergency.
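To make the difference between a buffered write and a synced write concrete, here is a minimal Python sketch of the two system-call patterns. It is not PostgreSQL code: the file name toy_wal.log and the helper write_record are hypothetical, and the fallback from fdatasync to fsync is an assumption made so the sketch also runs on platforms without fdatasync.

```python
import os
import tempfile


def write_record(path: str, payload: bytes, durable: bool) -> None:
    """Append payload to path; optionally force it to stable storage."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        os.write(fd, payload)  # data reaches the kernel page cache only
        if durable:
            # fdatasync is not available everywhere; fall back to fsync.
            sync = getattr(os, "fdatasync", os.fsync)
            sync(fd)  # returns once the kernel reports the data flushed
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as tmp:
    wal_path = os.path.join(tmp, "toy_wal.log")
    write_record(wal_path, b"commit 1\n", durable=True)   # like fsync = on
    write_record(wal_path, b"commit 2\n", durable=False)  # like fsync = off
    with open(wal_path, "rb") as f:
        contents = f.read()  # both records are readable from the process
```

Both records read back identically while the OS is running, which is why a server with fsync = off looks healthy. Only the first record was forced to storage before the call returned.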
How it works
Under normal operation, the commit path looks like this. A transaction generates WAL records in shared memory. At commit, the WAL writer (or the committing backend) writes those records to the current WAL file and issues a sync method call. Only after the sync succeeds does PostgreSQL acknowledge the commit to the client. If the server crashes after the commit returns, crash recovery replays WAL from the last checkpoint and brings the database to a consistent state.
With fsync = off, the sync call is skipped. WAL records may sit in the OS page cache for seconds or minutes. Data file modifications from checkpoints or background writer activity are also not forced to disk. Checkpoints lose their meaning as a recovery boundary because the writes they issued may never have reached the physical medium. The database appears to run normally because reads and writes succeed from the perspective of the PostgreSQL process. The corruption is only revealed when the OS cache is lost.
flowchart TD
A[COMMIT received] --> B{fsync setting}
B -->|on| C[wal_sync_method flushes WAL]
C --> D[WAL on durable storage]
D --> E[Acknowledge to client]
B -->|off| F[WAL written to OS page cache]
F --> E
F --> G[Power loss or OS crash]
G --> H[Unflushed WAL lost]
H --> I[Corruption or silent data loss]

Because WAL is the source of truth and data files are treated as a cache, losing WAL means the data files cannot be reconciled. The checkpoint mechanism, crash recovery, and point-in-time recovery all depend on WAL being present and complete. fsync = off breaks that dependency.
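The commit path described above can be sketched as a toy model. This is an illustration under simplifying assumptions, not how PostgreSQL is implemented: ToyWal, its lists of transaction ids, and the os_crash method are all hypothetical, and the model only shows lost commits, not the data-file inconsistency that can accompany them.

```python
from dataclasses import dataclass, field


@dataclass
class ToyWal:
    """Toy model: a volatile page cache in front of durable storage."""
    fsync: bool = True
    page_cache: list = field(default_factory=list)    # lost on OS crash
    disk: list = field(default_factory=list)          # survives OS crash
    acknowledged: list = field(default_factory=list)  # what clients were told

    def flush(self) -> None:
        """Move everything buffered in the page cache to durable storage."""
        self.disk.extend(self.page_cache)
        self.page_cache.clear()

    def commit(self, txid: int) -> None:
        self.page_cache.append(txid)    # WAL record written to the kernel
        if self.fsync:
            self.flush()                # sync call happens before the ack
        self.acknowledged.append(txid)  # client sees a successful COMMIT

    def os_crash(self) -> None:
        self.page_cache.clear()         # unflushed WAL is gone

    def lost_commits(self) -> list:
        return [t for t in self.acknowledged if t not in self.disk]


def run(fsync: bool) -> list:
    wal = ToyWal(fsync=fsync)
    for txid in (1, 2, 3):
        wal.commit(txid)
    wal.os_crash()
    return wal.lost_commits()


lost_with_fsync = run(fsync=True)      # []
lost_without_fsync = run(fsync=False)  # [1, 2, 3]
```

In the model, every commit is acknowledged in both modes. The only difference is whether the record was on durable storage at the moment of acknowledgment.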
There is a secondary interaction with full_page_writes. When fsync = off, partial page writes caused by an OS crash cannot be recovered using WAL full-page images, because the WAL itself may not have been flushed. The PostgreSQL documentation suggests that if you disable fsync, you might also want to disable full_page_writes. In practice, this means you are disabling two independent safety mechanisms at once.
Another related parameter is commit_delay, which makes a committing backend wait briefly so that several commits can share one WAL flush. When fsync = off, there is no flush to share and PostgreSQL skips the delay entirely, so tuning commit_delay has no effect.
Where it shows up in production
The most common production encounter with fsync = off is not a deliberate choice. It is a configuration that was copied from a benchmark environment, a developer workstation, or an old tuning guide and never reverted before going live.
Benchmarking and load testing. Disabling fsync is a legitimate way to measure CPU-bound query throughput without storage latency. The error is leaving the setting in place when the test ends.
Developer environments and containers. PostgreSQL configuration files from development environments sometimes disable fsync to reduce disk wear or speed up test suites. When these configurations are promoted to staging or production via configuration management templates, the setting comes along for the ride.
Misunderstanding safer alternatives. Teams looking to reduce commit latency sometimes conflate fsync = off with synchronous_commit = off. The former disables storage sync entirely. The latter simply returns the commit acknowledgment before the WAL flush completes, while still allowing the background WAL writer to flush to disk. They are not equivalent.
Storage migration or hardware testing. During storage benchmarking or migration validation, operators may disable fsync to isolate throughput numbers. If the instance is later promoted to a production role without reverting the change, the durability guarantee is gone.
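Because the setting usually arrives through a copied configuration file, a small audit step in a deployment pipeline can catch it. The following is a rough sketch, assuming a simplified reading of postgresql.conf syntax: it handles name = value lines and trailing # comments, but not include directives, # characters inside quoted values, or abbreviated boolean spellings. The function names and the sample file contents are hypothetical.

```python
import re

# Common spellings PostgreSQL accepts for boolean false (case-insensitive).
FALSE_VALUES = {"off", "false", "no", "0"}
DURABILITY_SETTINGS = ("fsync", "full_page_writes")

# name, optional "=", then the value
LINE_RE = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_.]*)\s*=?\s*(.*)$")


def effective_settings(conf_text: str) -> dict:
    """Return the last value assigned to each parameter in conf_text."""
    settings = {}
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # naive comment stripping
        if not line:
            continue
        match = LINE_RE.match(line)
        if match:
            name, value = match.groups()
            # Lowercasing is fine for booleans; it would mangle paths.
            settings[name.lower()] = value.strip().strip("'").lower()
    return settings


def durability_findings(conf_text: str) -> list:
    """List durability settings that are explicitly disabled."""
    settings = effective_settings(conf_text)
    return [
        f"{name} = {settings[name]}"
        for name in DURABILITY_SETTINGS
        if settings.get(name) in FALSE_VALUES
    ]


sample_conf = """
# copied from a developer workstation (hypothetical example)
shared_buffers = 128MB
fsync = off            # speeds up the test suite
full_page_writes = on
"""

findings = durability_findings(sample_conf)  # ["fsync = off"]
```

A file scan like this is only a first filter. Included files, postgresql.auto.conf, and command-line options can override what one file says, so the value reported by the running server in pg_settings remains the authoritative check.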
Tradeoffs and common misuses
fsync = off is not a performance tuning knob for production OLTP. It is a durability bypass switch.
fsync = off versus synchronous_commit = off. This distinction is critical. synchronous_commit = off allows the database to return a commit acknowledgment before WAL is flushed, but PostgreSQL still writes WAL and still flushes it to disk via the walwriter process. A crash may lose the most recently acknowledged transactions, a window the PostgreSQL documentation bounds at three times wal_writer_delay, but the on-disk database remains logically consistent. Those transactions simply never happened, as far as recovery is concerned. This makes synchronous_commit = off a reasonable choice for non-critical batch loads or telemetry ingestion where losing a few seconds of data is tolerable. With fsync = off, the WAL itself may be incomplete or missing, so recovery may not be able to reach a consistent state at all.
fsync = off versus full_page_writes = off. Disabling full_page_writes reduces WAL volume by omitting full page images, but it removes the protection against partial page writes during an OS crash, so it is only defensible when the storage stack guarantees atomic page writes. It does nothing to make fsync = off safer. Disabling both removes both the page-image safety net and the WAL durability guarantee.
The data_sync_retry defense. By default, data_sync_retry is off. When it is off, any failure to flush modified data files causes PostgreSQL to emit a PANIC-level error and crash the entire instance. This behavior was introduced in response to kernel-level fsync bugs, where retrying fsync after a failure could falsely report success while data was lost. The PANIC forces human intervention rather than risking silent corruption. This parameter does not make fsync = off safe, but it illustrates how seriously PostgreSQL treats storage sync failures.
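The contrast between synchronous_commit = off and fsync = off can be illustrated with a toy consistency check. This is a sketch under heavy simplification: real recovery works on WAL positions and pages, not sets of transaction ids, and the function names and sample sets are hypothetical. The factor of three comes from the PostgreSQL documentation on asynchronous commit, and 200 ms is the default wal_writer_delay.

```python
def async_commit_risk_window_ms(wal_writer_delay_ms: int = 200) -> int:
    """Documented upper bound on the loss window for synchronous_commit = off."""
    return 3 * wal_writer_delay_ms


def is_recoverable(wal_on_disk: set, data_changes_on_disk: set) -> bool:
    """Write-ahead rule: every data change on disk needs its WAL on disk."""
    return data_changes_on_disk.issubset(wal_on_disk)


# synchronous_commit = off with fsync = on: transactions 4 and 5 were
# acknowledged but not yet flushed. WAL is still synced before the data
# pages it covers, so what reached disk obeys the write-ahead rule.
async_wal, async_data = {1, 2, 3}, {1, 2}

# fsync = off: the kernel wrote back pages in its own order, so a data
# change from transaction 5 is on disk while its WAL record is not.
nofsync_wal, nofsync_data = {1, 2}, {1, 5}

window_ms = async_commit_risk_window_ms()                  # 600
async_ok = is_recoverable(async_wal, async_data)           # True
nofsync_ok = is_recoverable(nofsync_wal, nofsync_data)     # False
```

In the first case the cost is bounded and measured in lost commits. In the second, the on-disk state contains changes that no surviving WAL record explains.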
Signals to watch in production
| Signal | Why it matters | Warning sign |
|---|---|---|
| fsync setting in pg_settings | Determines whether WAL is forced to durable storage | fsync = off in any production or staging instance |
| Confusion with synchronous_commit | Teams often disable the wrong parameter when chasing latency | fsync = off set alongside comments about “group commit” or “commit latency” |
| commit_delay having no effect | When fsync = off, there is almost no WAL flush cost to amortize | Tuning commit_delay produces zero throughput change |
| PANIC logs after storage errors | data_sync_retry = off (default) crashes the instance on flush failure | Any PANIC that follows a storage sync or file flush failure |
| Corruption on startup after a crash | Without fsync, OS crashes leave the data directory inconsistent | Errors about invalid WAL records, missing pages, or checksum failures following a reboot |
| full_page_writes = on with fsync = off | Full page images cannot protect against torn pages if WAL itself is not durable | Misconfiguration that disables sync while keeping page images enabled |
How Netdata helps
Netdata collects PostgreSQL parameters and runtime metrics that surface configuration drift without manual pg_settings inspection.
- Configuration audit. Netdata collects PostgreSQL parameters, making it easy to spot fsync = off or full_page_writes = off during routine fleet reviews.
- WAL and checkpoint correlation. By monitoring WAL generation rates alongside disk flush latency, you can identify whether commit latency is actually bounded by storage sync or by other factors before considering any durability tradeoff.
- Crash and PANIC detection. Netdata monitors PostgreSQL log severity. A spike in PANIC messages related to data_sync_retry or storage sync failures surfaces immediately.
- Commit latency baselines. Tracking transaction commit times helps you measure the real impact of synchronous_commit adjustments, which is the safer first lever to pull when write latency is too high.