Configuration

Define roles & clusters for PostgreSQL

You can define different types of instances & clusters.

  • Identity: Parameters used for describing a PostgreSQL cluster
  • Primary: Define a single instance cluster.
  • Replica: Define a basic HA cluster with one primary & one replica.
  • Offline: Define a dedicated instance for OLAP/ETL/Interactive queries
  • Sync Standby: Enable synchronous commit to ensure no data loss.
  • Quorum Commit: Use quorum sync commit for an even higher consistency level.
  • Standby Cluster: Clone an existing cluster and follow it
  • Delayed Cluster: Clone an existing cluster for emergency data recovery
  • Citus Cluster: Define a Citus distributed database cluster

Primary

Let’s start with the simplest case, singleton meta:

  1. pg-test:
  2. hosts:
  3. 10.10.10.11: { pg_seq: 1, pg_role: primary }
  4. vars:
  5. pg_cluster: pg-test

Use the following command to create a primary database instance on the 10.10.10.11 node.

  1. bin/pgsql-add pg-test

Replica

To add a physical replica, you can assign a new instance to pg-test with pg_role set to replica

  1. pg-test:
  2. hosts:
  3. 10.10.10.11: { pg_seq: 1, pg_role: primary }
  4. 10.10.10.12: { pg_seq: 2, pg_role: replica } # <--- newly added
  5. vars:
  6. pg_cluster: pg-test

You can create an entire cluster or append a replica to the existing cluster:

  1. bin/pgsql-add pg-test # init entire cluster in one-pass
  2. bin/pgsql-add pg-test 10.10.10.12 # add replica to existing cluster

Offline

The offline instance is a dedicated replica to serve slow queries, ETL, OLAP traffic and interactive queries, etc…

To add an offline instance, assign a new instance with pg_role set to offline.

  1. pg-test:
  2. hosts:
  3. 10.10.10.11: { pg_seq: 1, pg_role: primary }
  4. 10.10.10.12: { pg_seq: 2, pg_role: replica }
  5. 10.10.10.13: { pg_seq: 2, pg_role: offline } # <--- newly added
  6. vars:
  7. pg_cluster: pg-test

Offline instance works like common replica instances, but it is used as a backup server in pg-test-replica service. That is to say, offline and primary instance serves only when all replica instances are down.

You can have ad hoc access control offline with pg_default_hba_rules and pg_hba_rules. It will apply to the offline instance and any instances with pg_offline_query flag.


Sync Standby

PostgreSQL uses asynchronous commit in stream replication by default. Which may have a small replication lag. (10KB / 10ms). A small window of data loss may occur when the primary fails (can be controlled with pg_rpo.), but it is acceptable for most scenarios.

But in some critical scenarios (e.g. financial transactions), data loss is totally unacceptable or read-your-write consistency is required. In this case, you can enable synchronous commit to ensure that.

To enable sync standby mode, you can simply use crit.yml template in pg_conf

  1. pg-test:
  2. hosts:
  3. 10.10.10.11: { pg_seq: 1, pg_role: primary }
  4. 10.10.10.12: { pg_seq: 2, pg_role: replica }
  5. 10.10.10.13: { pg_seq: 3, pg_role: replica }
  6. vars:
  7. pg_cluster: pg-test
  8. pg_conf: crit.yml # <--- use crit template

To enable sync standby on existing clusters, config the cluster and enable synchronous_mode:

  1. $ pg edit-config pg-test # run on admin node with admin user
  2. +++
  3. -synchronous_mode: false # <--- old value
  4. +synchronous_mode: true # <--- new value
  5. synchronous_mode_strict: false
  6. Apply these changes? [y/N]: y

Quorum Commit

When sync standby is enabled, PostgreSQL will pick one replica as the standby instance, and all other replicas as candidates. Primary will wait until the standby instance flushes to disk before a commit is confirmed, and the standby instance will always have the latest data without any lags.

However, you can achieve an even higher/lower consistency level with the quorum commit (trade-off with availability).

For example, to have any 2 replicas to confirm a commit:

  1. pg-test:
  2. hosts:
  3. 10.10.10.10: { pg_seq: 1, pg_role: primary } # <--- pg-test-1
  4. 10.10.10.11: { pg_seq: 2, pg_role: replica } # <--- pg-test-2
  5. 10.10.10.12: { pg_seq: 3, pg_role: replica } # <--- pg-test-3
  6. 10.10.10.13: { pg_seq: 4, pg_role: replica } # <--- pg-test-4
  7. vars:
  8. pg_cluster: pg-test
  9. pg_conf: crit.yml # <--- use crit template

Adjust synchronous_standby_names and synchronous_node_count accordingly:

  • synchronous_standby_names = ANY 2 (pg-test-2, pg-test-3, pg-test-4)
  • synchronous_node_count : 2

Example: Enable Quorum Commit

  1. $ pg edit-config pg-test
  2. ---
  3. +++
  4. @@ -82,10 +82,12 @@
  5. work_mem: 4MB
  6. + synchronous_standby_names: 'ANY 2 (pg-test-2, pg-test-3, pg-test-4)'
  7. -synchronous_mode: false
  8. +synchronous_mode: true
  9. +synchronous_node_count: 2
  10. synchronous_mode_strict: false
  11. Apply these changes? [y/N]: y

After the application, the configuration takes effect, and two Sync Standby appear. When the cluster has Failover or expansion and contraction, please adjust these parameters to avoid service unavailability.

  1. + Cluster: pg-test (7080814403632534854) +---------+----+-----------+-----------------+
  2. | Member | Host | Role | State | TL | Lag in MB | Tags |
  3. +-----------+-------------+--------------+---------+----+-----------+-----------------+
  4. | pg-test-1 | 10.10.10.10 | Leader | running | 1 | | clonefrom: true |
  5. | pg-test-2 | 10.10.10.11 | Sync Standby | running | 1 | 0 | clonefrom: true |
  6. | pg-test-3 | 10.10.10.12 | Sync Standby | running | 1 | 0 | clonefrom: true |
  7. | pg-test-4 | 10.10.10.13 | Replica | running | 1 | 0 | clonefrom: true |
  8. +-----------+-------------+--------------+---------+----+-----------+-----------------+

Standby Cluster

You can clone an existing cluster and create a standby cluster, which can be used for migration, horizontal split, multi-az deployment, or disaster recovery.

A standby cluster’s definition is just the same as any other normal cluster, except there’s a pg_upstream defined on the primary instance.

For example, you have a pg-test cluster, to create a standby cluster pg-test2, the inventory may look like this:

  1. # pg-test is the original cluster
  2. pg-test:
  3. hosts:
  4. 10.10.10.11: { pg_seq: 1, pg_role: primary }
  5. vars: { pg_cluster: pg-test }
  6. # pg-test2 is a standby cluster of pg-test.
  7. pg-test2:
  8. hosts:
  9. 10.10.10.12: { pg_seq: 1, pg_role: primary , pg_upstream: 10.10.10.11 } # <--- pg_upstream is defined here
  10. 10.10.10.13: { pg_seq: 2, pg_role: replica }
  11. vars: { pg_cluster: pg-test2 }

And pg-test2-1, the primary of pg-test2 will be a replica of pg-test and serve as a Standby Leader in pg-test2.

Just make sure that the pg_upstream parameter is configured on the primary of the backup cluster to pull backups from the original upstream automatically.

  1. bin/pgsql-add pg-test # Creating the original cluster
  2. bin/pgsql-add pg-test2 # Creating a Backup Cluster

Example: Change Replication Upstream

You can change the replication upstream of the standby cluster when necessary (e.g. upstream failover).

To do so, just change the standby_cluster.host to the new upstream IP address and apply.

  1. $ pg edit-config pg-test2
  2. standby_cluster:
  3. create_replica_methods:
  4. - basebackup
  5. - host: 10.10.10.13 # <--- The old upstream
  6. + host: 10.10.10.12 # <--- The new upstream
  7. port: 5432
  8. Apply these changes? [y/N]: y
  9. ``` Example: Promote Standby Cluster
  10. You can promote the standby cluster to a standalone cluster at any time.
  11. To do so, you have to [config]($79bb26d61ff640c0.md#config-cluster) the cluster and wipe the entire `standby_cluster` section then apply.

$ pg edit-config pg-test2 -standby_cluster:

  • create_replica_methods:
    • basebackup
  • host: 10.10.10.11
  • port: 5432

Apply these changes? [y/N]: y ``` Example: Cascade Replica

If the pg_upstream is specified for replica rather than primary, the replica will be configured as a cascade replica with the given upstream ip instead of the cluster primary

  1. pg-test:
  2. hosts: # pg-test-1 ---> pg-test-2 ---> pg-test-3
  3. 10.10.10.11: { pg_seq: 1, pg_role: primary }
  4. 10.10.10.12: { pg_seq: 2, pg_role: replica } # <--- bridge instance
  5. 10.10.10.13: { pg_seq: 2, pg_role: replica, pg_upstream: 10.10.10.12 }
  6. # ^--- replicate from pg-test-2 (the bridge) instead of pg-test-1 (the primary)
  7. vars: { pg_cluster: pg-test }

Delayed Cluster

A delayed cluster is a special type of standby cluster, which is used to recover “drop-by-accident” ASAP.

For example, if you wish to have a cluster pg-testdelay which has the same data as 1-day ago pg-test cluster:

  1. # pg-test is the original cluster
  2. pg-test:
  3. hosts:
  4. 10.10.10.11: { pg_seq: 1, pg_role: primary }
  5. vars: { pg_cluster: pg-test }
  6. # pg-testdelay is a delayed cluster of pg-test.
  7. pg-testdelay:
  8. hosts:
  9. 10.10.10.12: { pg_seq: 1, pg_role: primary , pg_upstream: 10.10.10.11, pg_delay: 1d }
  10. 10.10.10.13: { pg_seq: 2, pg_role: replica }
  11. vars: { pg_cluster: pg-test2 }

You can also configure a replication delay on the existing standby cluster.

  1. $ pg edit-config pg-testdelay
  2. standby_cluster:
  3. create_replica_methods:
  4. - basebackup
  5. host: 10.10.10.11
  6. port: 5432
  7. + recovery_min_apply_delay: 1h # <--- add delay here
  8. Apply these changes? [y/N]: y

When some tuples & tables are dropped by accident, you can advance this delayed cluster to a proper time point and select data from it.

It takes more resources, but can be much faster and have less impact than PITR


Citus Cluster

Pigsty has native citus support. Check files/pigsty/citus.yml for example.

To define a citus cluster,

Besides, extra hba rules that allow ssl access from local & other data nodes are required. Which may looks like this

  1. all:
  2. children:
  3. pg-citus0: # citus coordinator, pg_group = 0
  4. hosts: { 10.10.10.10: { pg_seq: 1, pg_role: primary } }
  5. vars: { pg_cluster: pg-citus0 , pg_group: 0 }
  6. pg-citus1: # citus data node 1
  7. hosts: { 10.10.10.11: { pg_seq: 1, pg_role: primary } }
  8. vars: { pg_cluster: pg-citus1 , pg_group: 1 }
  9. pg-citus2: # citus data node 2
  10. hosts: { 10.10.10.12: { pg_seq: 1, pg_role: primary } }
  11. vars: { pg_cluster: pg-citus2 , pg_group: 2 }
  12. pg-citus3: # citus data node 3, with an extra replica
  13. hosts:
  14. 10.10.10.13: { pg_seq: 1, pg_role: primary }
  15. 10.10.10.14: { pg_seq: 2, pg_role: replica }
  16. vars: { pg_cluster: pg-citus3 , pg_group: 3 }
  17. vars: # global parameters for all citus clusters
  18. pg_mode: citus # pgsql cluster mode: citus
  19. pg_shard: pg-citus # citus shard name: pg-citus
  20. patroni_citus_db: meta # citus distributed database name
  21. pg_dbsu_password: DBUser.Postgres # all dbsu password access for citus cluster
  22. pg_users: [ { name: dbuser_meta ,password: DBUser.Meta ,pgbouncer: true ,roles: [ dbrole_admin ] } ]
  23. pg_databases: [ { name: meta ,extensions: [ { name: citus }, { name: postgis }, { name: timescaledb } ] } ]
  24. pg_hba_rules:
  25. - { user: 'all' ,db: all ,addr: 127.0.0.1/32 ,auth: ssl ,title: 'all user ssl access from localhost' }
  26. - { user: 'all' ,db: all ,addr: intra ,auth: ssl ,title: 'all user ssl access from intranet' }

And you can create distributed table & reference table on the coordinator node, and query them from any data node.

  1. SELECT create_distributed_table('pgbench_accounts', 'aid'); SELECT truncate_local_data_after_distributing_table($$public.pgbench_accounts$$);
  2. SELECT create_reference_table('pgbench_branches') ; SELECT truncate_local_data_after_distributing_table($$public.pgbench_branches$$);
  3. SELECT create_reference_table('pgbench_history') ; SELECT truncate_local_data_after_distributing_table($$public.pgbench_history$$);
  4. SELECT create_reference_table('pgbench_tellers') ; SELECT truncate_local_data_after_distributing_table($$public.pgbench_tellers$$);

Last modified 2023-02-27: refresh en docs to v2.0 (e82b371)