PostgreSQL major version upgrade: pg_upgrade, logical replication, and rollback plans

A major version upgrade is one of the highest-risk maintenance operations on a PostgreSQL cluster. Storage format, system catalogs, and the query planner all change between major versions, so you cannot simply restart the server with new binaries. Every major upgrade is a migration, even when it happens on the same host.

Most teams choose between two paths. pg_upgrade rewrites system catalogs while reusing or copying data files. It is fast but requires a downtime window. Logical replication streams row changes to a new cluster running the target version, enabling near-zero-downtime cutover but adding operational complexity. The wrong choice is usually the one made without understanding rollback boundaries.

This article explains the mechanics of both paths, when each is appropriate, and what irreversibility means for your rollback plan.

What a major version upgrade changes

Between major versions, PostgreSQL changes the on-disk representation of heap pages, the system catalog layout, and the WAL format. Physical streaming replication cannot cross major versions, and the data directory is not backward compatible. Minor version upgrades within the same major version are simple binary replacements; major version upgrades are not.

Because the physical format changes, pg_basebackup from an old primary cannot restore directly into a new major version, and a replica running PostgreSQL 16 cannot stream from a primary running PostgreSQL 15. Logical replication is the only built-in mechanism that crosses major version boundaries because it replicates row changes rather than physical pages.

pg_upgrade: in-place binary upgrade

pg_upgrade is a binary upgrade tool that creates new system catalogs and converts metadata to the target version’s format, then handles the actual table data via one of several transfer strategies. It always runs from the new version’s binary directory against the old version’s data directory. By default, pg_upgrade runs temporary postmaster processes on port 50432 to avoid unintended client connections during the upgrade.

Transfer modes

ModeMechanismOld cluster after upgradeRollback safety
--copy (default)Byte-for-byte copy of data filesIntact, can restartSafe to revert
--linkHard-links data files into new directoryUnsafe to restart once new cluster writesRequires backup restore
--cloneReflink/CoW copy (btrfs, XFS reflink, APFS)IntactSafe to revert
--copy-file-rangeioctl(FICLONERANGE) clone on same filesystemIntactSafe to revert
--swap (PG 18+)Moves files and swaps catalogsUnsafe once swap beginsRequires backup restore

--link is the fastest option because it avoids copying multi-terabyte data files, but it comes with a critical operational constraint. Once the new cluster starts and writes to shared files, the old cluster is permanently unsafe to restart. Attempting to fall back to the old cluster after the new one has started with --link or --swap will corrupt data.

--clone offers similar speed on supported filesystems without destroying the old cluster’s independence. If your filesystem does not support reflinks, --copy-file-range is faster than --copy on SSDs but requires the old and new data directories to reside on the same filesystem.

Use --jobs (or -j) to parallelize the upgrade across databases and tablespaces. A sensible starting point is the number of CPU cores.

Always run pg_upgrade --check with the intended mode flag. A plain --check without --link, --clone, --copy-file-range, or --swap skips mode-specific verifications.

Upgrading standby servers

If the primary was upgraded with --link, do not run pg_upgrade on standby servers. Instead, use rsync from the primary to propagate the hard-linked files to each standby:

# Sync upgraded primary data to standby (run for each tablespace)
rsync --archive --delete --hard-links --size-only --no-inc-recursive \
  old_cluster_dir new_cluster_dir standby_dir

If pg_wal is on a separate mount, include that directory in the rsync. Replication slots on the old primary before PostgreSQL 17.0 are not copied to new standbys and must be recreated manually.

Logical replication: the online migration path

Logical replication decodes WAL into row-level changes and streams them to a subscriber. Because it replicates logical changes rather than physical pages, it works across major versions. This enables a near-zero-downtime upgrade pattern:

  1. Provision the new cluster at the target PostgreSQL version.
  2. Apply schema migrations to the new cluster.
  3. Create a publication on the old primary and a subscription on the new cluster.
  4. Wait for replication lag to reach zero.
  5. Pause application traffic (for example, via PgBouncer PAUSE).
  6. Verify lag is still zero.
  7. Manually synchronize sequence values on the new cluster to exceed the old primary’s current values.
  8. Disable or drop the subscription on the new cluster.
  9. Redirect application traffic to the new cluster.
  10. Resume traffic.

Keep the old cluster running for 24 to 48 hours as an immediate rollback target.

Critical gotchas

Sequences are not replicated. Logical replication streams row data, not sequence states. If your application inserts using DEFAULT NEXTVAL(), the new cluster’s sequences must be manually advanced past the old primary’s values before cutover. Increment by a safety margin (for example, +1000) to account for in-flight inserts during the brief pause window.

Schema changes require manual application. DDL executed on the primary during setup is not propagated to the subscriber. Any schema changes must be applied to the new cluster before re-enabling replication or during a maintenance window.

Logical replication slot migration (PG 17+). Starting with PostgreSQL 17.0, pg_upgrade can migrate logical replication slots to the new cluster, provided the new cluster has wal_level = logical and sufficient max_replication_slots. This avoids manual slot recreation. For clusters older than PG 17.0, slots are silently ignored by pg_upgrade.

Upgrading logical replication clusters (PG 17+). If both publisher and subscriber run PostgreSQL 17.0 or later, you can upgrade them via pg_upgrade without rebuilding replication. Disable subscriptions before upgrading (ALTER SUBSCRIPTION ... DISABLE), upgrade the publisher first (or subscriber first, order is flexible), then re-enable and refresh the subscription.

Choosing between the two paths

Choose pg_upgrade when the cluster is small enough that the downtime window is acceptable, all data directories can reside on the same filesystem (enabling --link or --clone), the topology is simple, and you need the fastest possible upgrade path.

Choose logical replication when downtime must be minimized to seconds, the dataset is large enough that copying files would take hours, you need to validate the new cluster under production load before cutover, or the old cluster must remain available as a hot rollback target immediately after migration.

flowchart TD
    A[Major version upgrade required] --> B{Downtime budget}
    B -->|Minutes acceptable| C[pg_upgrade]
    B -->|Near-zero required| D[Logical replication]
    C --> E{Speed vs safety}
    E -->|Fastest| F["--link or --swap
Old cluster unsafe"] E -->|Safest| G["--copy or --clone
Old cluster intact"] D --> H[Create publication
and subscription] H --> I{Replication
lag zero?} I -->|No| J[Wait and monitor] J --> I I -->|Yes| K[Pause traffic
Sync sequences
Cut over] F --> L[Rollback requires
backup restore] G --> M[Rollback to old cluster
if needed] K --> N[Keep old cluster
24-48h as fallback]

Rollback reality and irreversibility

Rollback capability depends entirely on the upgrade path.

With pg_upgrade --copy, --clone, or --copy-file-range, the old cluster remains intact. If the new cluster fails validation, stop it and restart the old cluster.

With pg_upgrade --link or --swap, the old cluster’s data files are permanently shared or modified once the new cluster starts. Rollback to the old cluster is impossible without restoring from backup. Always take a verified backup before using --link or --swap.

With logical replication, rollback is possible as long as the old cluster remains running and no writes have been accepted by the new cluster. Once the new cluster takes writes, rolling back requires re-replicating those writes to the old cluster or accepting data loss. Synchronize sequences carefully so the old cluster can resume if needed.

Post-upgrade verification

Extensions do not auto-upgrade during pg_upgrade. Shared object files for the target version must be installed on the server before running pg_upgrade. Do not run CREATE EXTENSION in the new cluster before pg_upgrade; schema definitions are duplicated from the old cluster. After the upgrade completes, apply extension SQL updates:

-- Check installed extensions
SELECT extname, extversion FROM pg_extension;

-- Apply available updates
ALTER EXTENSION extension_name UPDATE;

Regenerate statistics immediately

pg_upgrade transfers most optimizer statistics but excludes statistics created with CREATE STATISTICS, custom extension statistics, and cumulative statistics. Missing statistics cause severe query plan regressions.

Run the generated scripts in two stages:

# Stage 1: minimal stats for relations lacking any
vacuumdb --all --analyze-in-stages --missing-stats-only

# Stage 2: full cumulative statistics
PGOPTIONS='-c vacuum_cost_delay=0' vacuumdb --all --analyze-only

Use --jobs to accelerate both stages. In PostgreSQL 18+, vacuumdb --missing-stats-only analyzes only objects missing statistics (such as expression indexes), and planner statistics are better preserved during upgrades.

Breaking changes and reindexing

Review the target version’s release notes for breaking changes. PostgreSQL 17, for example, removed GUCs including old_snapshot_threshold, renamed pg_stat_statements timing columns (blk_read_time to shared_blk_read_time), and changed pg_attribute.attstattarget to use NULL instead of -1 for default targets.

If the cluster uses full-text search or pg_trgm indexes, pg_upgrade may recommend reindexing after the upgrade due to potential collation or encoding changes.

Signals to watch in production

SignalWhy it mattersWarning sign
pg_stat_replication.replay_lagLogical replication must be caught up before cutoverLag > 0 at migration time
pg_stat_statements.mean_exec_timeMissing stats cause plan regressionsp99 latency spike post-upgrade
pg_stat_user_tables.n_dead_tupVacuum behavior may shift after upgradeDead tuple ratio > 20%
pg_stat_database.checksum_failuresUpgrade stress can expose latent corruptionNon-zero count on primary or replica
pg_replication_slots.activeSlot health affects WAL retention and upgradeInactive slots after PG 17+ migration

How Netdata helps

  • Correlate query latency p99 before and after upgrade to detect plan regressions caused by missing statistics.
  • Monitor logical replication lag in real time to determine the exact moment of cutover readiness.
  • Alert on dead tuple ratio and autovacuum execution post-upgrade, as planner changes can alter vacuum effectiveness.
  • Track connection counts and connection state distribution during the traffic cutover window.
  • Watch checkpoint patterns and disk I/O saturation while pg_upgrade or post-upgrade vacuumdb runs.