PostgreSQL major version upgrade: pg_upgrade, logical replication, and rollback plans
A major version upgrade is one of the highest-risk maintenance operations on a PostgreSQL cluster. Storage format, system catalogs, and the query planner all change between major versions, so you cannot simply restart the server with new binaries. Every major upgrade is a migration, even when it happens on the same host.
Most teams choose between two paths. pg_upgrade rewrites system catalogs while reusing or copying data files. It is fast but requires a downtime window. Logical replication streams row changes to a new cluster running the target version, enabling near-zero-downtime cutover but adding operational complexity. The wrong choice is usually the one made without understanding rollback boundaries.
This article explains the mechanics of both paths, when each is appropriate, and what irreversibility means for your rollback plan.
What a major version upgrade changes
Between major versions, PostgreSQL changes the on-disk representation of heap pages, the system catalog layout, and the WAL format. Physical streaming replication cannot cross major versions, and the data directory is not backward compatible. Minor version upgrades within the same major version are simple binary replacements; major version upgrades are not.
Because the physical format changes, pg_basebackup from an old primary cannot restore directly into a new major version, and a replica running PostgreSQL 16 cannot stream from a primary running PostgreSQL 15. Logical replication is the only built-in mechanism that crosses major version boundaries because it replicates row changes rather than physical pages.
pg_upgrade: in-place binary upgrade
pg_upgrade is a binary upgrade tool that creates new system catalogs and converts metadata to the target version’s format, then handles the actual table data via one of several transfer strategies. It always runs from the new version’s binary directory against the old version’s data directory. By default, pg_upgrade runs temporary postmaster processes on port 50432 to avoid unintended client connections during the upgrade.
Transfer modes
| Mode | Mechanism | Old cluster after upgrade | Rollback safety |
|---|---|---|---|
--copy (default) | Byte-for-byte copy of data files | Intact, can restart | Safe to revert |
--link | Hard-links data files into new directory | Unsafe to restart once new cluster writes | Requires backup restore |
--clone | Reflink/CoW copy (btrfs, XFS reflink, APFS) | Intact | Safe to revert |
--copy-file-range | ioctl(FICLONERANGE) clone on same filesystem | Intact | Safe to revert |
--swap (PG 18+) | Moves files and swaps catalogs | Unsafe once swap begins | Requires backup restore |
--link is the fastest option because it avoids copying multi-terabyte data files, but it comes with a critical operational constraint. Once the new cluster starts and writes to shared files, the old cluster is permanently unsafe to restart. Attempting to fall back to the old cluster after the new one has started with --link or --swap will corrupt data.
--clone offers similar speed on supported filesystems without destroying the old cluster’s independence. If your filesystem does not support reflinks, --copy-file-range is faster than --copy on SSDs but requires the old and new data directories to reside on the same filesystem.
Use --jobs (or -j) to parallelize the upgrade across databases and tablespaces. A sensible starting point is the number of CPU cores.
Always run pg_upgrade --check with the intended mode flag. A plain --check without --link, --clone, --copy-file-range, or --swap skips mode-specific verifications.
Upgrading standby servers
If the primary was upgraded with --link, do not run pg_upgrade on standby servers. Instead, use rsync from the primary to propagate the hard-linked files to each standby:
# Sync upgraded primary data to standby (run for each tablespace)
rsync --archive --delete --hard-links --size-only --no-inc-recursive \
old_cluster_dir new_cluster_dir standby_dir
If pg_wal is on a separate mount, include that directory in the rsync. Replication slots on the old primary before PostgreSQL 17.0 are not copied to new standbys and must be recreated manually.
Logical replication: the online migration path
Logical replication decodes WAL into row-level changes and streams them to a subscriber. Because it replicates logical changes rather than physical pages, it works across major versions. This enables a near-zero-downtime upgrade pattern:
- Provision the new cluster at the target PostgreSQL version.
- Apply schema migrations to the new cluster.
- Create a publication on the old primary and a subscription on the new cluster.
- Wait for replication lag to reach zero.
- Pause application traffic (for example, via PgBouncer
PAUSE). - Verify lag is still zero.
- Manually synchronize sequence values on the new cluster to exceed the old primary’s current values.
- Disable or drop the subscription on the new cluster.
- Redirect application traffic to the new cluster.
- Resume traffic.
Keep the old cluster running for 24 to 48 hours as an immediate rollback target.
Critical gotchas
Sequences are not replicated. Logical replication streams row data, not sequence states. If your application inserts using DEFAULT NEXTVAL(), the new cluster’s sequences must be manually advanced past the old primary’s values before cutover. Increment by a safety margin (for example, +1000) to account for in-flight inserts during the brief pause window.
Schema changes require manual application. DDL executed on the primary during setup is not propagated to the subscriber. Any schema changes must be applied to the new cluster before re-enabling replication or during a maintenance window.
Logical replication slot migration (PG 17+). Starting with PostgreSQL 17.0, pg_upgrade can migrate logical replication slots to the new cluster, provided the new cluster has wal_level = logical and sufficient max_replication_slots. This avoids manual slot recreation. For clusters older than PG 17.0, slots are silently ignored by pg_upgrade.
Upgrading logical replication clusters (PG 17+). If both publisher and subscriber run PostgreSQL 17.0 or later, you can upgrade them via pg_upgrade without rebuilding replication. Disable subscriptions before upgrading (ALTER SUBSCRIPTION ... DISABLE), upgrade the publisher first (or subscriber first, order is flexible), then re-enable and refresh the subscription.
Choosing between the two paths
Choose pg_upgrade when the cluster is small enough that the downtime window is acceptable, all data directories can reside on the same filesystem (enabling --link or --clone), the topology is simple, and you need the fastest possible upgrade path.
Choose logical replication when downtime must be minimized to seconds, the dataset is large enough that copying files would take hours, you need to validate the new cluster under production load before cutover, or the old cluster must remain available as a hot rollback target immediately after migration.
flowchart TD
A[Major version upgrade required] --> B{Downtime budget}
B -->|Minutes acceptable| C[pg_upgrade]
B -->|Near-zero required| D[Logical replication]
C --> E{Speed vs safety}
E -->|Fastest| F["--link or --swap
Old cluster unsafe"]
E -->|Safest| G["--copy or --clone
Old cluster intact"]
D --> H[Create publication
and subscription]
H --> I{Replication
lag zero?}
I -->|No| J[Wait and monitor]
J --> I
I -->|Yes| K[Pause traffic
Sync sequences
Cut over]
F --> L[Rollback requires
backup restore]
G --> M[Rollback to old cluster
if needed]
K --> N[Keep old cluster
24-48h as fallback]Rollback reality and irreversibility
Rollback capability depends entirely on the upgrade path.
With pg_upgrade --copy, --clone, or --copy-file-range, the old cluster remains intact. If the new cluster fails validation, stop it and restart the old cluster.
With pg_upgrade --link or --swap, the old cluster’s data files are permanently shared or modified once the new cluster starts. Rollback to the old cluster is impossible without restoring from backup. Always take a verified backup before using --link or --swap.
With logical replication, rollback is possible as long as the old cluster remains running and no writes have been accepted by the new cluster. Once the new cluster takes writes, rolling back requires re-replicating those writes to the old cluster or accepting data loss. Synchronize sequences carefully so the old cluster can resume if needed.
Post-upgrade verification
Extensions do not auto-upgrade during pg_upgrade. Shared object files for the target version must be installed on the server before running pg_upgrade. Do not run CREATE EXTENSION in the new cluster before pg_upgrade; schema definitions are duplicated from the old cluster. After the upgrade completes, apply extension SQL updates:
-- Check installed extensions
SELECT extname, extversion FROM pg_extension;
-- Apply available updates
ALTER EXTENSION extension_name UPDATE;
Regenerate statistics immediately
pg_upgrade transfers most optimizer statistics but excludes statistics created with CREATE STATISTICS, custom extension statistics, and cumulative statistics. Missing statistics cause severe query plan regressions.
Run the generated scripts in two stages:
# Stage 1: minimal stats for relations lacking any
vacuumdb --all --analyze-in-stages --missing-stats-only
# Stage 2: full cumulative statistics
PGOPTIONS='-c vacuum_cost_delay=0' vacuumdb --all --analyze-only
Use --jobs to accelerate both stages. In PostgreSQL 18+, vacuumdb --missing-stats-only analyzes only objects missing statistics (such as expression indexes), and planner statistics are better preserved during upgrades.
Breaking changes and reindexing
Review the target version’s release notes for breaking changes. PostgreSQL 17, for example, removed GUCs including old_snapshot_threshold, renamed pg_stat_statements timing columns (blk_read_time to shared_blk_read_time), and changed pg_attribute.attstattarget to use NULL instead of -1 for default targets.
If the cluster uses full-text search or pg_trgm indexes, pg_upgrade may recommend reindexing after the upgrade due to potential collation or encoding changes.
Signals to watch in production
| Signal | Why it matters | Warning sign |
|---|---|---|
pg_stat_replication.replay_lag | Logical replication must be caught up before cutover | Lag > 0 at migration time |
pg_stat_statements.mean_exec_time | Missing stats cause plan regressions | p99 latency spike post-upgrade |
pg_stat_user_tables.n_dead_tup | Vacuum behavior may shift after upgrade | Dead tuple ratio > 20% |
pg_stat_database.checksum_failures | Upgrade stress can expose latent corruption | Non-zero count on primary or replica |
pg_replication_slots.active | Slot health affects WAL retention and upgrade | Inactive slots after PG 17+ migration |
How Netdata helps
- Correlate query latency p99 before and after upgrade to detect plan regressions caused by missing statistics.
- Monitor logical replication lag in real time to determine the exact moment of cutover readiness.
- Alert on dead tuple ratio and autovacuum execution post-upgrade, as planner changes can alter vacuum effectiveness.
- Track connection counts and connection state distribution during the traffic cutover window.
- Watch checkpoint patterns and disk I/O saturation while
pg_upgradeor post-upgradevacuumdbruns.
Related guides
- How PostgreSQL actually works in production: a mental model for operators
- PostgreSQL ALTER TABLE blocked: zero-downtime DDL patterns
- PostgreSQL autovacuum blocked by long-running transaction: detection and fix
- PostgreSQL autovacuum not running: detection, causes, and fixes
- PostgreSQL autovacuum tuning: per-table thresholds for high-churn workloads
- PostgreSQL backup strategy: pg_dump, pg_basebackup, and pgBackRest compared
- PostgreSQL blocking queries: finding the root blocker in a lock cascade
- PostgreSQL checkpoint storms: detection, causes, and tuning
- PostgreSQL: checkpoints are occurring too frequently – what to tune
- PostgreSQL connection exhaustion: detection, diagnosis, and prevention
- PostgreSQL connection refused: pg_hba, listen_addresses, and TCP diagnosis
- PostgreSQL: database is not accepting commands to avoid wraparound data loss






