Config

Pigsty uses declarative config to describe the environment, infrastructure, and database clusters.

Pigsty defines infrastructure and database clusters through an inventory, and each Pigsty deployment has a corresponding config. Pigsty’s config follows the “Infra as Data” philosophy: users describe their requirements in a declarative config, and Pigsty adjusts the underlying components to the expected state.

Formally, the inventory can be implemented as a default local config file or as dynamic configuration data in a CMDB. This article uses the default YAML configuration file pigsty.yml as an example. During the configure step, Pigsty detects the current node environment and generates a recommended config file.

The main content of the inventory is config entries. Pigsty provides 220 parameters that can be configured at multiple levels, and most of them can keep their default values. Config entries fall into four major categories: INFRA, NODES (host nodes), PGSQL, and REDIS, which are further subdivided into 32 subcategories.

Configure

Go to the Pigsty project directory and run ./configure. Pigsty generates a config file based on the current machine environment; this process is called Configure.

./configure [-n|--non-interactive] [-d|--download] [-i|--ip <ipaddr>] [-m|--mode {auto|demo}]

configure performs the following checks. Minor problems are fixed automatically; otherwise, it reports an error and exits.

check_kernel     # kernel        = Linux
check_machine    # machine       = x86_64
check_release    # release       = CentOS 7.x
check_sudo       # current_user  = NOPASSWD sudo
check_ssh        # current_user  = NOPASSWD ssh
check_ipaddr     # primary_ip (arg|probe|input)              (INTERACTIVE: ask for ip)
check_admin      # check current_user@primary_ip nopass ssh sudo
check_mode       # check machine spec to determine node mode (tiny|oltp|olap|crit)
check_config     # generate config according to primary_ip and mode
check_pkg        # check offline installation package exists (INTERACTIVE: ask for download)
check_repo       # create repo from pkg.tgz if exists
check_repo_file  # create local file repo file if repo exists
check_utils      # check ansible sshpass and other utils installed

Running ./configure directly launches an interactive CLI wizard that prompts the user to answer the following three questions:

IP address

When multiple NICs with multiple IPs are detected on the current machine, the config wizard prompts you to enter the primary IP, which is the IP used to access the node from the internal network. Do not use a public IP here.

Download Package

When the offline package /tmp/pkg.tgz does not exist on the node, the config wizard asks whether to download it from GitHub. Selecting Y starts the download, and selecting N skips it. If your node has good Internet access with a suitable proxy config, or if you need to build the offline package yourself, you can choose N.

Config Template

The config wizard automatically selects a config template based on the current machine environment. You can also specify a config template manually with -m <mode>.

demo: The project’s default config file, used by the 4-node sandbox, with all features enabled.
auto: Suitable for deployment in production environments, with more stable and conservative configs.

In addition, Pigsty ships several preconfigured config templates that can be selected directly with -m. See files/conf for details.

The most important step when applying a config template is replacing the placeholder IP 10.10.10.10 with the real IP (intranet primary IP) of the current machine and selecting the appropriate database specification template for the current machine spec. You can use the generated config file as is or customize it further.

Standard output of configure

$ ./configure
configure pigsty v1.5.1 begin
[ OK ] kernel = Linux
[ OK ] machine = x86_64
[ OK ] release = 7.8.2003 , perfect
[ OK ] sudo = root ok
[ OK ] ssh = root@127.0.0.1 ok
[ OK ] primary_ip = 10.10.10.10  (from probe)
[ OK ] admin = root@10.10.10.10 ok
[ OK ] spec = mini (cpu = 2)
[ OK ] config = auto @ 10.10.10.10
[ OK ] cache = /tmp/pkg.tgz exists
[ OK ] repo = /www/pigsty ok
[ OK ] repo file = /etc/yum.repos.d/pigsty-local.repo
[ OK ] utils = install from local file repo
[ OK ] ansible = ansible 2.9.27
configure pigsty done. Use 'make install' to proceed

Config File

A specific sample config file is available at the root of the Pigsty project: pigsty.yml.

The top level of the config file is a single object with the key all, which contains two sub-items: vars and children.

all:                      # Top-level object: all
  vars: <123 keys>        # Global config: all.vars
  children:               # Group definitions: all.children; each entry defines a cluster
    meta: <2 keys>...     # Special group: meta, which defines the environment's meta nodes
    pg-meta: <2 keys>...  # Detailed definition of database cluster pg-meta
    pg-test: <2 keys>...  # Detailed definition of database cluster pg-test
    ...

The content of vars is a set of key-value pairs that define the global config parameters: the key is the name of the config entry and the value is its content.

The content of children is also a set of key-value pairs: the key is the cluster name and the value is the specific cluster definition. A sample cluster definition is shown below:

The cluster definition also includes two sub-items: vars defines the config at the cluster level, and hosts defines the cluster’s instance members.
The params in the cluster config override the global params, and the cluster params are in turn overridden by params of the same name at the instance level. The only mandatory cluster-level parameter is pg_cluster, which is the name of the cluster and must be consistent with the upper-level cluster name.
The hosts section uses key-value pairs to define the cluster’s instance members: the key is the IP (which must be reachable over ssh), and the value is the specific instance config params.
There are two mandatory params in the instance config: pg_seq and pg_role, which are the unique serial number of the instance and the role of the instance, respectively.

pg-test:                 # The group key is used as the cluster name by default
  vars:                  # Database cluster level variables
    pg_cluster: pg-test  # A mandatory config entry defined at the cluster level, consistent throughout pg-test. 
  hosts:                 # Database Cluster Members
    10.10.10.11: {pg_seq: 1, pg_role: primary} # Database Instance Members
    10.10.10.12: {pg_seq: 2, pg_role: replica} # The identity parameters pg_role and pg_seq must be defined
    10.10.10.13: {pg_seq: 3, pg_role: offline} # Variables at the instance level can be specified here
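
The rules for a cluster definition (pg_cluster at the cluster level, a unique pg_seq and a pg_role on every instance) can be sketched as a small check. This is an illustrative Python sketch, not part of Pigsty: the helper name check_cluster is hypothetical, and the dict literal stands in for the pg-test definition as it might look after the YAML has been parsed.

```python
# Illustrative sketch only; check_cluster is a hypothetical helper, not a Pigsty API.
# The dict mirrors the pg-test definition as it would look once the YAML is parsed.
pg_test = {
    "vars": {"pg_cluster": "pg-test"},
    "hosts": {
        "10.10.10.11": {"pg_seq": 1, "pg_role": "primary"},
        "10.10.10.12": {"pg_seq": 2, "pg_role": "replica"},
        "10.10.10.13": {"pg_seq": 3, "pg_role": "offline"},
    },
}


def check_cluster(name, cluster):
    """Return a list of problems; an empty list means the definition passes."""
    problems = []
    # pg_cluster is mandatory and should match the group name under all.children.
    if cluster.get("vars", {}).get("pg_cluster") != name:
        problems.append(f"{name}: vars.pg_cluster must be set to '{name}'")
    seen_seq = set()
    for ip, params in cluster.get("hosts", {}).items():
        # pg_seq and pg_role are the two mandatory instance-level params.
        for key in ("pg_seq", "pg_role"):
            if key not in params:
                problems.append(f"{name}/{ip}: missing {key}")
        seq = params.get("pg_seq")
        if seq is not None:
            if seq in seen_seq:
                problems.append(f"{name}/{ip}: duplicate pg_seq {seq}")
            seen_seq.add(seq)
    return problems


print(check_cluster("pg-test", pg_test))  # prints []
```

Returning a list of messages rather than raising on the first problem makes it possible to report every issue in a cluster definition in one pass.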

Pigsty config files follow Ansible rules, are written in YAML, and use a single config file by default. The default config file path is pigsty.yml in the root directory of the Pigsty source code. The default config file is specified via inventory = pigsty.yml in ansible.cfg in the same directory. A different config file can be specified via -i <config_path> when executing any playbook.

The config file is meant to be used together with Ansible, a popular DevOps tool. If you are proficient in Ansible, you can adapt the organization and structure of the config file according to Ansible’s inventory organization rules.

Please browse the Ansible Quick Start to begin executing playbooks with Ansible.

Config Entry

Config entries take the form of key-value pairs: the key is the name of the config entry and the value is its content.

Pigsty’s params can be configured at different levels and are inherited and overridden according to fixed rules: a higher-priority config entry overrides a lower-priority config entry with the same name.

Config Entry Levels

In Pigsty’s config file, config entries can appear in three locations: global, cluster, and instance. Config entries defined in cluster vars override global config entries with the same key, and config entries defined on an instance in turn override both cluster and global config entries.

Granularity	Scope	Priority	Description	Location
Global	Global	Low	Consistent within the same set of deployment envs	`all.vars.xxx`
Cluster	Cluster	Medium	Consistent within the same cluster	`all.children.<cls>.vars.xxx`
Instance	Instance	High	The most granular level of config	`all.children.<cls>.hosts.<ins>.xxx`

Not all config entries are suitable for every level. For example, infra params are usually defined only in the global config; params such as database instance labels, roles, and load balancing weights can only be configured at the instance level; and some operational options can only be provided as CLI params. For details, see the list of config entries.

Default & Overwrite

In addition to the three config granularities, Pigsty config entries have two extra levels of priority: the default value and the forced override by CLI params:

Default: When a config entry does not appear at any of the global, cluster, or instance levels, its default value is used. The default value has the lowest priority. Default params are defined in roles/<role>/defaults/main.yml.
Parameter: Config entries passed as CLI params have the highest priority and override all other levels of config. Some config entries can only be specified as CLI params.

Levels	Priority	Source	Description	Location
Default	Lowest	Default	Default values defined in the code logic	`roles/<role>/defaults/main.yml`
Global	Low	Global	Consistent within the same set of deployment envs	`all.vars.xxx`
Cluster	Medium	Cluster	Consistent within the same cluster	`all.children.<cls>.vars.xxx`
Instance	High	Instance	The most granular level of config	`all.children.<cls>.hosts.<ins>.xxx`
Argument	Highest	Parameter	Pass in CLI arguments	`-e`
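
The five priority levels in the table can be modeled as a chained lookup. The following is a simplified Python sketch of that ordering, not Pigsty or Ansible code: the function resolve and the keys example_param and another_param are made up for illustration, and Ansible’s real variable precedence has more levels than the five shown here.

```python
from collections import ChainMap


def resolve(defaults, global_vars, cluster_vars, instance_vars, cli_args):
    """Merge the five levels into one dict of effective values.

    ChainMap looks up keys left to right, so the highest-priority
    mapping (CLI arguments) is listed first and defaults come last.
    """
    return dict(ChainMap(cli_args, instance_vars, cluster_vars, global_vars, defaults))


# example_param and another_param are hypothetical keys, used only to show the lookup order.
effective = resolve(
    defaults={"example_param": "default", "another_param": "default"},
    global_vars={"example_param": "global"},
    cluster_vars={"example_param": "cluster", "pg_cluster": "pg-test"},
    instance_vars={"pg_seq": 1, "pg_role": "primary"},
    cli_args={},
)

print(effective["example_param"])  # cluster: the cluster level beats global and default
print(effective["another_param"])  # default: no higher level defines it
```

Passing a non-empty cli_args, which corresponds to -e on the command line, would override every other level for the keys it contains.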

Config Category

Pigsty contains 220 fixed config entries divided into four sections: INFRA, NODES, PGSQL, and REDIS, for a total of 32 categories.

Usually, only the node and database identity parameters are mandatory; other params can keep their default values and be modified on demand.

Category	Section	Description	Count
INFRA	CONNECT	Connection parameters	1
INFRA	REPO	Local source infra	10
INFRA	CA	Public-Private Key Infra	5
INFRA	NGINX	Nginx Web Server	5
INFRA	NAMESERVER	DNS Server	1
INFRA	PROMETHEUS	Monitoring Time Series Database	7
INFRA	EXPORTER	Universal Exporter Config	3
INFRA	GRAFANA	Grafana Visualization Platform	9
INFRA	LOKI	Loki log collection platform	5
INFRA	DCS	Distributed Config Storage Meta DB	8
NODES	NODE_IDENTITY	Node identity parameters	5
NODES	NODE_DNS	Node Domain Name Resolution	5
NODES	NODE_PACKAGES	Node Packages	4
NODES	NODE_KERNEL_MODULES	Node Kernel Module	1
NODES	NODE_TUNE	Node parameter tuning	2
NODES	NODE_ADMIN	Node Admin User	6
NODES	NODE_TIME	Node time zone and time sync	4
NODES	NODE_EXPORTER	Node Metrics Exporter	3
NODES	PROMTAIL	Log collection component	5
PGSQL	PG_IDENTITY	PGSQL Identity Parameters	13
PGSQL	PG_BUSINESS	PGSQL Business Object Definition	11
PGSQL	PG_INSTALL	PGSQL Installation	11
PGSQL	PG_BOOTSTRAP	PGSQL Cluster Initialization	24
PGSQL	PG_PROVISION	PGSQL Cluster Provisioning	9
PGSQL	PG_EXPORTER	PGSQL Metrics Exporter	13
PGSQL	PG_SERVICE	PGSQL Service Access	16
REDIS	REDIS_IDENTITY	REDIS Identity Parameters	3
REDIS	REDIS_PROVISION	REDIS Cluster Provisioning	14
REDIS	REDIS_EXPORTER	REDIS Metrics Exporter	3

Last modified 2022-06-20: add timescaledb (3c335f4)