Migration

How to migrate an existing PostgreSQL database into a Pigsty-managed cluster?

Pigsty has a built-in playbook pgsql-migration.yml to perform online database migration based on logical replication.

With proper automation, downtime can be minimized to a few seconds. But beware that logical replication requires PostgreSQL 10 or above. For older versions, you can still use the scaffolding here with a pg_dump | psql pipeline instead of logical replication, as sketched below.
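
A minimal dump & restore sketch for that fallback, reusing the example credentials from the task definition below (a plain illustration, not a generated script; downtime lasts for the whole copy):

  # dump the src database and restore it into the dst database
  pg_dump 'postgres://dbuser_dba:DBUser.DBA@10.10.10.10:5432/meta' \
    | psql 'postgres://dbuser_dba:DBUser.DBA@10.10.10.11:5432/test'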

Define a Migration Task

You have to create a migration task definition file to use this playbook.

Check files/migration/pg-meta.yml for an example.

It will migrate the database pg-meta.meta to pg-test.test:

  pg-meta-1 10.10.10.10 --> pg-test-1 10.10.10.11 (10.10.10.12,10.10.10.13)

You have to tell Pigsty where the source and destination clusters are: the cluster names, the databases to be migrated, and the primary IP addresses.

You need superuser privileges on both sides to proceed.

You can override the superuser connection to the source cluster with src_pg, and the logical replication connection string with sub_conn; otherwise, Pigsty's default admin & replicator credentials will be used.

  ---
  #-----------------------------------------------------------------
  # PG_MIGRATION
  #-----------------------------------------------------------------
  context_dir: ~/migration   # migration manuals & scripts
  #-----------------------------------------------------------------
  # SRC Cluster (The OLD Cluster)
  #-----------------------------------------------------------------
  src_cls: pg-meta           # src cluster name         <REQUIRED>
  src_db: meta               # src database name        <REQUIRED>
  src_ip: 10.10.10.10        # src cluster primary ip   <REQUIRED>
  #src_pg: ''                # if defined, use this as src dbsu pgurl instead of:
  #                          # postgres://{{ pg_admin_username }}@{{ src_ip }}/{{ src_db }}
  #                          # e.g. 'postgres://dbuser_dba:DBUser.DBA@10.10.10.10:5432/meta'
  #sub_conn: ''              # if defined, use this as subscription connstr instead of:
  #                          # host={{ src_ip }} dbname={{ src_db }} user={{ pg_replication_username }}
  #                          # e.g. 'host=10.10.10.10 dbname=meta user=replicator password=DBUser.Replicator'
  #-----------------------------------------------------------------
  # DST Cluster (The New Cluster)
  #-----------------------------------------------------------------
  dst_cls: pg-test           # dst cluster name         <REQUIRED>
  dst_db: test               # dst database name        <REQUIRED>
  dst_ip: 10.10.10.11        # dst cluster primary ip   <REQUIRED>
  #dst_pg: ''                # if defined, use this as dst dbsu pgurl instead of:
  #                          # postgres://{{ pg_admin_username }}@{{ dst_ip }}/{{ dst_db }}
  #                          # e.g. 'postgres://dbuser_dba:DBUser.DBA@10.10.10.11:5432/test'
  #-----------------------------------------------------------------
  # PGSQL
  #-----------------------------------------------------------------
  pg_dbsu: postgres
  pg_replication_username: replicator
  pg_replication_password: DBUser.Replicator
  pg_admin_username: dbuser_dba
  pg_admin_password: DBUser.DBA
  pg_monitor_username: dbuser_monitor
  pg_monitor_password: DBUser.Monitor
  #-----------------------------------------------------------------
  ...

Generate Migration Plan

The playbook does not migrate src to dst itself; instead, it generates everything you need to do so.
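
For example, assuming the task definition above is saved as files/migration/pg-meta.yml (the path here follows the example file and is an assumption about your layout), the invocation could look like this:

  # generate the migration context from the task definition file
  ./pgsql-migration.yml -e @files/migration/pg-meta.yml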

After execution, you will find the migration context directory under ~/migration/pg-meta.meta by default.

Follow the README.md there and execute these scripts one by one to perform the migration:

  # this script will set up the migration context with env vars
  . ~/migration/pg-meta.meta/activate

  # these scripts check the src cluster status
  # and help generate the new cluster definition in pigsty
  ./check-user     # check src users
  ./check-db       # check src databases
  ./check-hba      # check src hba rules
  ./check-repl     # check src replica identities
  ./check-misc     # check src special objects

  # these scripts build logical replication between the existing
  # src cluster and the pigsty-managed dst cluster; schema & data
  # will be synced in realtime, except for sequences
  ./copy-schema    # copy schema to dst
  ./create-pub     # create publication on src
  ./create-sub     # create subscription on dst
  ./copy-progress  # print logical replication progress
  ./copy-diff      # quick src & dst diff by counting tables

  # these scripts run during the online migration, which will stop
  # the src cluster and copy sequence numbers (which are not synced
  # by logical replication); you have to reroute your app traffic
  # according to your access method (dns, vip, haproxy, pgbouncer, etc...),
  # then perform cleanup to drop the subscription and publication
  ./copy-seq [n]   # sync sequence numbers; if n is given, an additional shift will be applied
  #./disable-src   # restrict src cluster access to admin node & new cluster (YOUR IMPLEMENTATION)
  #./re-routing    # ROUTE APPLICATION TRAFFIC FROM SRC TO DST! (YOUR IMPLEMENTATION)
  ./drop-sub       # drop subscription on dst after migration
  ./drop-pub       # drop publication on src after migration
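
For reference, ./create-pub and ./create-sub boil down to standard PostgreSQL logical replication DDL. A rough sketch of the equivalent commands, using the default credentials from the task definition above (the pub/sub names are illustrative; the generated scripts pick their own names and add error handling):

  # create publication on src, then subscribe to it from dst
  psql -c 'CREATE PUBLICATION pub FOR ALL TABLES;' \
       'postgres://dbuser_dba:DBUser.DBA@10.10.10.10:5432/meta'
  psql -c "CREATE SUBSCRIPTION sub
           CONNECTION 'host=10.10.10.10 dbname=meta user=replicator password=DBUser.Replicator'
           PUBLICATION pub;" \
       'postgres://dbuser_dba:DBUser.DBA@10.10.10.11:5432/test'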

Caveats

You can use ./copy-seq 1000 to advance all sequences by a given number (e.g. 1000) after syncing them, which may prevent potential serial primary key conflicts in the new cluster.
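
Conceptually, the shift reads a sequence's current value on src and pushes the matching dst sequence past it. A sketch for a single sequence (the sequence name here is hypothetical):

  # read last_value on src, then set the dst sequence to that value + 1000
  N=$(psql -At -c 'SELECT last_value FROM users_id_seq;' \
        'postgres://dbuser_dba:DBUser.DBA@10.10.10.10:5432/meta')
  psql -c "SELECT setval('users_id_seq', $N + 1000);" \
        'postgres://dbuser_dba:DBUser.DBA@10.10.10.11:5432/test'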

You have to implement your own ./re-routing script to route application traffic from src to dst, since we don't know how your traffic is routed (e.g. dns, vip, haproxy, or pgbouncer). Of course, you can always do it by hand…
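
As one illustration for the pgbouncer case (the config path and database entry are assumptions about your environment, not something Pigsty generates):

  # hypothetical: point the pgbouncer database entry at the new
  # primary & database, then reload without a full restart
  sed -i 's/host=10.10.10.10 dbname=meta/host=10.10.10.11 dbname=test/' \
      /etc/pgbouncer/pgbouncer.ini
  systemctl reload pgbouncer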

You have to implement your own ./disable-src script to restrict access to the src cluster. You can do that by changing HBA rules & reloading (recommended), or by simply shutting down postgres, pgbouncer, or haproxy…
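
For example, a restrictive pg_hba.conf along these lines on the src primary, followed by a reload (pg_ctl reload, or SELECT pg_reload_conf()), would cut off ordinary clients while keeping the subscription alive. The addresses come from the example task; everything else is an assumption about your environment:

  # hypothetical restrictive HBA for the src primary
  local  all  all                    ident    # local admin access
  host   all  all  10.10.10.11/32    md5      # dst primary (subscription)
  host   all  all  0.0.0.0/0         reject   # everyone else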
