Infrastructure as Code

Infrastructure as Code (IaC): Pigsty manages every component of the entire deployment environment the IaC way.

Pigsty follows the philosophy of IaC and GitOps: a Pigsty deployment is described by a declarative config inventory and materialized through idempotent playbooks.

Users declare their desired state through configuration parameters, and the playbooks adjust the target nodes to that state in an idempotent way. This is similar to Kubernetes CRDs & Operators; Pigsty implements the same functionality on bare metal and virtual machines.
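
In practice the loop looks like this (a minimal sketch; pg-test is a hypothetical cluster group declared in the inventory):

```bash
# 1. Declare: describe the desired state in the config inventory
vi pigsty.yml            # add clusters, nodes, and parameters

# 2. Apply: run the idempotent playbook to converge nodes toward that state
./pgsql.yml -l pg-test   # adjust hosts in the pg-test group to match their definition
```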


Declaring Modules

Take the following default configuration snippet as an example. It describes one node, 10.10.10.10, with the INFRA, NODE, ETCD, MINIO, and PGSQL modules installed on it.

```yaml
# infrastructure cluster for monitoring, alerting, DNS, NTP, etc...
infra: { hosts: { 10.10.10.10: { infra_seq: 1 } } }

# minio cluster, s3-compatible object storage
minio: { hosts: { 10.10.10.10: { minio_seq: 1 } }, vars: { minio_cluster: minio } }

# etcd cluster, used as the DCS required by PostgreSQL high availability
etcd: { hosts: { 10.10.10.10: { etcd_seq: 1 } }, vars: { etcd_cluster: etcd } }

# PGSQL example cluster: pg-meta
pg-meta: { hosts: { 10.10.10.10: { pg_seq: 1, pg_role: primary } }, vars: { pg_cluster: pg-meta } }
```

To actually install these modules, run the following playbooks:

```bash
./infra.yml -l 10.10.10.10   # init the INFRA module on node 10.10.10.10
./etcd.yml  -l 10.10.10.10   # init the ETCD  module on node 10.10.10.10
./minio.yml -l 10.10.10.10   # init the MINIO module on node 10.10.10.10
./pgsql.yml -l 10.10.10.10   # init the PGSQL module on node 10.10.10.10
```

Declaring Clusters

You can declare a PostgreSQL database cluster by installing the PGSQL module on multiple nodes, turning them into a single service unit:

For example, to deploy a three-node high-availability PostgreSQL cluster built on streaming replication, using three nodes already managed by Pigsty, add the following definition under `all.children` in the config file `pigsty.yml`:

```yaml
pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
    10.10.10.12: { pg_seq: 2, pg_role: replica }
    10.10.10.13: { pg_seq: 3, pg_role: offline }
  vars: { pg_cluster: pg-test }
```

Once defined, you can create the cluster with the playbook:

```bash
bin/pgsql-add pg-test   # create the pg-test cluster
```
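
Because the config is the single source of truth, scaling out follows the same declare-then-apply pattern. A sketch assuming a hypothetical fourth node 10.10.10.14 has been added to the pg-test definition above:

```bash
bin/pgsql-add pg-test 10.10.10.14   # apply: add the newly declared replica to pg-test
bin/pgsql-rm  pg-test 10.10.10.14   # the reverse: remove that instance from the cluster
```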

[Figure: pigsty-iac.jpg]

You can use different instance roles, such as primary, replica, offline (dedicated offline replica), delayed (delayed replica), and sync standby; as well as different kinds of clusters, such as standby clusters, Citus clusters, and even Redis / MinIO / Etcd clusters.
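
A standby cluster, for instance, replicates an entire existing cluster and is declared by pointing its leader at the upstream primary. A minimal sketch with hypothetical IPs, assuming the pg-test cluster above already exists:

```yaml
pg-test2:   # hypothetical standby cluster of pg-test
  hosts:
    10.10.10.21: { pg_seq: 1, pg_role: primary, pg_upstream: 10.10.10.11 }  # pg_upstream marks the standby leader
    10.10.10.22: { pg_seq: 2, pg_role: replica }
  vars: { pg_cluster: pg-test2 }
```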


Customizing Cluster Content

Declarative definitions are not limited to the cluster itself: you can also define the databases, users, services, HBA rules, and more inside the cluster. For example, the following configuration deeply customizes the content of the default single-node pg-meta cluster:

It declares six business databases and seven business users; adds an extra standby service (a synchronous standby providing reads with zero replication lag); defines additional pg_hba rules and an L2 VIP pointing to the cluster primary; and customizes the backup policy.

```yaml
pg-meta:
  hosts: { 10.10.10.10: { pg_seq: 1, pg_role: primary , pg_offline_query: true } }
  vars:
    pg_cluster: pg-meta
    pg_databases:                       # define business databases on this cluster, array of database definition
      - name: meta                      # REQUIRED, `name` is the only mandatory field of a database definition
        baseline: cmdb.sql              # optional, database sql baseline path, (relative path among ansible search path, e.g files/)
        pgbouncer: true                 # optional, add this database to pgbouncer database list? true by default
        schemas: [pigsty]               # optional, additional schemas to be created, array of schema names
        extensions:                     # optional, additional extensions to be installed: array of `{name[,schema]}`
          - { name: postgis , schema: public }
          - { name: timescaledb }
        comment: pigsty meta database   # optional, comment string for this database
        owner: postgres                 # optional, database owner, postgres by default
        template: template1             # optional, which template to use, template1 by default
        encoding: UTF8                  # optional, database encoding, UTF8 by default. (MUST same as template database)
        locale: C                       # optional, database locale, C by default. (MUST same as template database)
        lc_collate: C                   # optional, database collate, C by default. (MUST same as template database)
        lc_ctype: C                     # optional, database ctype, C by default. (MUST same as template database)
        tablespace: pg_default          # optional, default tablespace, 'pg_default' by default.
        allowconn: true                 # optional, allow connection, true by default. false will disable connect at all
        revokeconn: false               # optional, revoke public connection privilege. false by default. (leave connect with grant option to owner)
        register_datasource: true       # optional, register this database to grafana datasources? true by default
        connlimit: -1                   # optional, database connection limit, default -1 disable limit
        pool_auth_user: dbuser_meta     # optional, all connection to this pgbouncer database will be authenticated by this user
        pool_mode: transaction          # optional, pgbouncer pool mode at database level, default transaction
        pool_size: 64                   # optional, pgbouncer pool size at database level, default 64
        pool_size_reserve: 32           # optional, pgbouncer pool size reserve at database level, default 32
        pool_size_min: 0                # optional, pgbouncer pool size min at database level, default 0
        pool_max_db_conn: 100           # optional, max database connections at database level, default 100
      - { name: grafana  ,owner: dbuser_grafana  ,revokeconn: true ,comment: grafana primary database }
      - { name: bytebase ,owner: dbuser_bytebase ,revokeconn: true ,comment: bytebase primary database }
      - { name: kong     ,owner: dbuser_kong     ,revokeconn: true ,comment: kong the api gateway database }
      - { name: gitea    ,owner: dbuser_gitea    ,revokeconn: true ,comment: gitea meta database }
      - { name: wiki     ,owner: dbuser_wiki     ,revokeconn: true ,comment: wiki meta database }
    pg_users:                           # define business users/roles on this cluster, array of user definition
      - name: dbuser_meta               # REQUIRED, `name` is the only mandatory field of a user definition
        password: DBUser.Meta           # optional, password, can be a scram-sha-256 hash string or plain text
        login: true                     # optional, can log in, true by default (new biz ROLE should be false)
        superuser: false                # optional, is superuser? false by default
        createdb: false                 # optional, can create database? false by default
        createrole: false               # optional, can create role? false by default
        inherit: true                   # optional, can this role use inherited privileges? true by default
        replication: false              # optional, can this role do replication? false by default
        bypassrls: false                # optional, can this role bypass row level security? false by default
        pgbouncer: true                 # optional, add this user to pgbouncer user-list? false by default (production user should be true explicitly)
        connlimit: -1                   # optional, user connection limit, default -1 disable limit
        expire_in: 3650                 # optional, now + n days when this role is expired (OVERWRITE expire_at)
        expire_at: '2030-12-31'         # optional, YYYY-MM-DD 'timestamp' when this role is expired (OVERWRITTEN by expire_in)
        comment: pigsty admin user      # optional, comment string for this user/role
        roles: [dbrole_admin]           # optional, belonged roles. default roles are: dbrole_{admin,readonly,readwrite,offline}
        parameters: {}                  # optional, role level parameters with `ALTER ROLE SET`
        pool_mode: transaction          # optional, pgbouncer pool mode at user level, transaction by default
        pool_connlimit: -1              # optional, max database connections at user level, default -1 disable limit
      - {name: dbuser_view     ,password: DBUser.Viewer   ,pgbouncer: true ,roles: [dbrole_readonly], comment: read-only viewer for meta database}
      - {name: dbuser_grafana  ,password: DBUser.Grafana  ,pgbouncer: true ,roles: [dbrole_admin] ,comment: admin user for grafana database }
      - {name: dbuser_bytebase ,password: DBUser.Bytebase ,pgbouncer: true ,roles: [dbrole_admin] ,comment: admin user for bytebase database }
      - {name: dbuser_kong     ,password: DBUser.Kong     ,pgbouncer: true ,roles: [dbrole_admin] ,comment: admin user for kong api gateway }
      - {name: dbuser_gitea    ,password: DBUser.Gitea    ,pgbouncer: true ,roles: [dbrole_admin] ,comment: admin user for gitea service }
      - {name: dbuser_wiki     ,password: DBUser.Wiki     ,pgbouncer: true ,roles: [dbrole_admin] ,comment: admin user for wiki.js service }
    pg_services:                        # extra services in addition to pg_default_services, array of service definition
      # standby service will route {ip|name}:5435 to sync replica's pgbouncer (5435->6432 standby)
      - name: standby                   # required, service name, the actual svc name will be prefixed with `pg_cluster`, e.g: pg-meta-standby
        port: 5435                      # required, service exposed port (work as kubernetes service node port mode)
        ip: "*"                         # optional, service bind ip address, `*` for all ip by default
        selector: "[]"                  # required, service member selector, use JMESPath to filter inventory
        dest: default                   # optional, destination port, default|postgres|pgbouncer|<port_number>, 'default' by default
        check: /sync                    # optional, health check url path, / by default
        backup: "[? pg_role == `primary`]"  # backup server selector
        maxconn: 3000                   # optional, max allowed front-end connection
        balance: roundrobin             # optional, haproxy load balance algorithm (roundrobin by default, other: leastconn)
        options: 'inter 3s fastinter 1s downinter 5s rise 3 fall 3 on-marked-down shutdown-sessions slowstart 30s maxconn 3000 maxqueue 128 weight 100'
    pg_hba_rules:
      - {user: dbuser_view , db: all ,addr: infra ,auth: pwd ,title: 'allow grafana dashboard access cmdb from infra nodes'}
    pg_vip_enabled: true
    pg_vip_address: 10.10.10.2/24
    pg_vip_interface: eth1
    node_crontab:  # make a full backup 1 am everyday
      - '00 01 * * * postgres /pg/bin/pg-backup full'
```
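
Such changes do not require recreating the cluster: the helper scripts in Pigsty's bin/ directory apply individual objects from the config incrementally (the invocations below are a sketch based on those wrappers):

```bash
bin/pgsql-user pg-meta dbuser_meta   # create/update the declared user on the running cluster
bin/pgsql-db   pg-meta meta          # create/update the declared database
bin/pgsql-svc  pg-meta               # reload services, e.g. the new standby service
bin/pgsql-hba  pg-meta               # re-render and reload HBA rules
```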

Declarative Access Control

You can also deeply customize Pigsty's access control through declarative configuration. For example, the following config applies hardened security settings to the pg-meta cluster:

It uses the three-node core cluster template crit.yml, which puts data consistency first and ensures zero data loss during failover. It enables an L2 VIP and restricts the listen addresses of the database and connection pool to three specific addresses: the local loopback IP, the intranet IP, and the VIP. The template force-enables SSL for the Patroni API and for Pgbouncer, and the HBA rules mandate SSL for access to the database cluster. It also loads the $libdir/passwordcheck extension via pg_libs to enforce a password strength policy.

Finally, a separate pg-meta-delay cluster is declared as a delayed replica of pg-meta trailing one hour behind, used for emergency recovery from accidental data deletion.

```yaml
pg-meta:                              # 3 instance postgres cluster `pg-meta`
  hosts:
    10.10.10.10: { pg_seq: 1, pg_role: primary }
    10.10.10.11: { pg_seq: 2, pg_role: replica }
    10.10.10.12: { pg_seq: 3, pg_role: replica , pg_offline_query: true }
  vars:
    pg_cluster: pg-meta
    pg_conf: crit.yml
    pg_users:
      - { name: dbuser_meta , password: DBUser.Meta   , pgbouncer: true , roles: [ dbrole_admin ]    , comment: pigsty admin user }
      - { name: dbuser_view , password: DBUser.Viewer , pgbouncer: true , roles: [ dbrole_readonly ] , comment: read-only viewer for meta database }
    pg_databases:
      - {name: meta ,baseline: cmdb.sql ,comment: pigsty meta database ,schemas: [pigsty] ,extensions: [{name: postgis, schema: public}, {name: timescaledb}]}
    pg_default_service_dest: postgres
    pg_services:
      - { name: standby ,src_ip: "*" ,port: 5435 , dest: default ,selector: "[]" , backup: "[? pg_role == `primary`]" }
    pg_vip_enabled: true
    pg_vip_address: 10.10.10.2/24
    pg_vip_interface: eth1
    pg_listen: '${ip},${vip},${lo}'
    patroni_ssl_enabled: true
    pgbouncer_sslmode: require
    pgbackrest_method: minio
    pg_libs: 'timescaledb, $libdir/passwordcheck, pg_stat_statements, auto_explain'  # add passwordcheck extension to enforce strong password
    pg_default_roles:                 # default roles and users in postgres cluster
      - { name: dbrole_readonly  ,login: false ,comment: role for global read-only access }
      - { name: dbrole_offline   ,login: false ,comment: role for restricted read-only access }
      - { name: dbrole_readwrite ,login: false ,roles: [dbrole_readonly] ,comment: role for global read-write access }
      - { name: dbrole_admin     ,login: false ,roles: [pg_monitor, dbrole_readwrite] ,comment: role for object creation }
      - { name: postgres       ,superuser: true   ,expire_in: 7300 ,comment: system superuser }
      - { name: replicator     ,replication: true ,expire_in: 7300 ,roles: [pg_monitor, dbrole_readonly] ,comment: system replicator }
      - { name: dbuser_dba     ,superuser: true   ,expire_in: 7300 ,roles: [dbrole_admin] ,pgbouncer: true ,pool_mode: session, pool_connlimit: 16 , comment: pgsql admin user }
      - { name: dbuser_monitor ,roles: [pg_monitor] ,expire_in: 7300 ,pgbouncer: true ,parameters: {log_min_duration_statement: 1000 } ,pool_mode: session ,pool_connlimit: 8 ,comment: pgsql monitor user }
    pg_default_hba_rules:             # postgres host-based auth rules by default
      - {user: '${dbsu}'    ,db: all         ,addr: local     ,auth: ident ,title: 'dbsu access via local os user ident'  }
      - {user: '${dbsu}'    ,db: replication ,addr: local     ,auth: ident ,title: 'dbsu replication from local os ident' }
      - {user: '${repl}'    ,db: replication ,addr: localhost ,auth: ssl   ,title: 'replicator replication from localhost'}
      - {user: '${repl}'    ,db: replication ,addr: intra     ,auth: ssl   ,title: 'replicator replication from intranet' }
      - {user: '${repl}'    ,db: postgres    ,addr: intra     ,auth: ssl   ,title: 'replicator postgres db from intranet' }
      - {user: '${monitor}' ,db: all         ,addr: localhost ,auth: pwd   ,title: 'monitor from localhost with password' }
      - {user: '${monitor}' ,db: all         ,addr: infra     ,auth: ssl   ,title: 'monitor from infra host with password'}
      - {user: '${admin}'   ,db: all         ,addr: infra     ,auth: ssl   ,title: 'admin @ infra nodes with pwd & ssl'   }
      - {user: '${admin}'   ,db: all         ,addr: world     ,auth: cert  ,title: 'admin @ everywhere with ssl & cert'   }
      - {user: '+dbrole_readonly',db: all    ,addr: localhost ,auth: ssl   ,title: 'pgbouncer read/write via local socket'}
      - {user: '+dbrole_readonly',db: all    ,addr: intra     ,auth: ssl   ,title: 'read/write biz user via password'     }
      - {user: '+dbrole_offline' ,db: all    ,addr: intra     ,auth: ssl   ,title: 'allow etl offline tasks from intranet'}
    pgb_default_hba_rules:            # pgbouncer host-based authentication rules
      - {user: '${dbsu}'    ,db: pgbouncer   ,addr: local     ,auth: peer  ,title: 'dbsu local admin access with os ident'}
      - {user: 'all'        ,db: all         ,addr: localhost ,auth: pwd   ,title: 'allow all user local access with pwd' }
      - {user: '${monitor}' ,db: pgbouncer   ,addr: intra     ,auth: ssl   ,title: 'monitor access via intranet with pwd' }
      - {user: '${monitor}' ,db: all         ,addr: world     ,auth: deny  ,title: 'reject all other monitor access addr' }
      - {user: '${admin}'   ,db: all         ,addr: intra     ,auth: ssl   ,title: 'admin access via intranet with pwd'   }
      - {user: '${admin}'   ,db: all         ,addr: world     ,auth: deny  ,title: 'reject all other admin access addr'   }
      - {user: 'all'        ,db: all         ,addr: intra     ,auth: ssl   ,title: 'allow all user intra access with pwd' }

# OPTIONAL delayed cluster for pg-meta
pg-meta-delay:                        # delayed instance for pg-meta (1 hour ago)
  hosts: { 10.10.10.13: { pg_seq: 1, pg_role: primary, pg_upstream: 10.10.10.10, pg_delay: 1h } }
  vars: { pg_cluster: pg-meta-delay }
```
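
With this configuration, plain-text connections should be rejected while SSL connections succeed. A quick smoke test with psql, using values from the config above (the behavior noted in the comments is the expected outcome, not captured output):

```bash
psql "host=10.10.10.2 user=dbuser_meta dbname=meta sslmode=require" -c 'SELECT 1'   # expected to succeed via the VIP
psql "host=10.10.10.2 user=dbuser_meta dbname=meta sslmode=disable" -c 'SELECT 1'   # expected to be rejected by HBA
```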

Citus Distributed Cluster

Below is the declarative configuration of a four-node Citus distributed cluster:

```yaml
all:
  children:
    pg-citus0:                        # citus coordinator, pg_group = 0
      hosts: { 10.10.10.10: { pg_seq: 1, pg_role: primary } }
      vars: { pg_cluster: pg-citus0 , pg_group: 0 }
    pg-citus1:                        # citus data node 1
      hosts: { 10.10.10.11: { pg_seq: 1, pg_role: primary } }
      vars: { pg_cluster: pg-citus1 , pg_group: 1 }
    pg-citus2:                        # citus data node 2
      hosts: { 10.10.10.12: { pg_seq: 1, pg_role: primary } }
      vars: { pg_cluster: pg-citus2 , pg_group: 2 }
    pg-citus3:                        # citus data node 3, with an extra replica
      hosts:
        10.10.10.13: { pg_seq: 1, pg_role: primary }
        10.10.10.14: { pg_seq: 2, pg_role: replica }
      vars: { pg_cluster: pg-citus3 , pg_group: 3 }
  vars:                               # global parameters for all citus clusters
    pg_mode: citus                    # pgsql cluster mode: citus
    pg_shard: pg-citus                # citus shard name: pg-citus
    patroni_citus_db: meta            # citus distributed database name
    pg_dbsu_password: DBUser.Postgres # all dbsu password access for citus cluster
    pg_users: [ { name: dbuser_meta ,password: DBUser.Meta ,pgbouncer: true ,roles: [ dbrole_admin ] } ]
    pg_databases: [ { name: meta ,extensions: [ { name: citus }, { name: postgis }, { name: timescaledb } ] } ]
    pg_hba_rules:
      - { user: 'all' ,db: all ,addr: 127.0.0.1/32 ,auth: ssl ,title: 'all user ssl access from localhost' }
      - { user: 'all' ,db: all ,addr: intra        ,auth: ssl ,title: 'all user ssl access from intranet'  }
```
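
Once provisioned, the coordinator behaves like an ordinary PostgreSQL endpoint, and sharding is driven through the standard Citus SQL API. A hypothetical example (the orders table is illustrative only):

```bash
psql -h 10.10.10.10 -U dbuser_meta -d meta <<'EOF'
CREATE TABLE orders (id BIGINT PRIMARY KEY, payload TEXT);
SELECT create_distributed_table('orders', 'id');  -- shard the table across data nodes by id
EOF
```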

Redis Clusters

Below are declarative configuration samples for a Redis primary-replica cluster, a sentinel cluster, and a native Redis Cluster:

```yaml
redis-ms:                             # redis classic primary & replica
  hosts: { 10.10.10.10: { redis_node: 1 , redis_instances: { 6379: { }, 6380: { replica_of: '10.10.10.10 6379' } } } }
  vars: { redis_cluster: redis-ms ,redis_password: 'redis.ms' ,redis_max_memory: 64MB }

redis-meta:                           # redis sentinel x 3
  hosts: { 10.10.10.11: { redis_node: 1 , redis_instances: { 26379: { } ,26380: { } ,26381: { } } } }
  vars:
    redis_cluster: redis-meta
    redis_password: 'redis.meta'
    redis_mode: sentinel
    redis_max_memory: 16MB
    redis_sentinel_monitor:           # primary list for redis sentinel, use cls as name, primary ip:port
      - { name: redis-ms, host: 10.10.10.10, port: 6379 ,password: redis.ms, quorum: 2 }

redis-test:                           # redis native cluster: 3m x 3s
  hosts:
    10.10.10.12: { redis_node: 1 ,redis_instances: { 6379: { } ,6380: { } ,6381: { } } }
    10.10.10.13: { redis_node: 2 ,redis_instances: { 6379: { } ,6380: { } ,6381: { } } }
  vars: { redis_cluster: redis-test ,redis_password: 'redis.test' ,redis_mode: cluster, redis_max_memory: 32MB }
```
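
These definitions are applied with the redis playbook, one cluster at a time:

```bash
./redis.yml -l redis-ms     # init the classic primary/replica cluster
./redis.yml -l redis-meta   # init the 3-instance sentinel cluster
./redis.yml -l redis-test   # init the native redis cluster
```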

Etcd Cluster

Below is a declarative configuration sample for a three-node etcd cluster:

```yaml
etcd:                                 # dcs service for postgres/patroni ha consensus
  hosts:                              # 1 node for testing, 3 or 5 for production
    10.10.10.10: { etcd_seq: 1 }      # etcd_seq required
    10.10.10.11: { etcd_seq: 2 }      # assign from 1 ~ n
    10.10.10.12: { etcd_seq: 3 }      # odd number please
  vars:                               # cluster level parameter override roles/etcd
    etcd_cluster: etcd                # mark etcd cluster name etcd
    etcd_safeguard: false             # safeguard against purging
    etcd_clean: true                  # purge etcd during init process
```
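
Apply it with the etcd playbook, then check quorum health with etcdctl (assuming etcdctl endpoints and certificates are already configured in the environment):

```bash
./etcd.yml                          # provision the 3-node etcd cluster declared above
etcdctl endpoint health --cluster   # every member should report healthy
```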

MinIO Cluster

Below is a declarative configuration sample for a three-node MinIO cluster:

```yaml
minio:
  hosts:
    10.10.10.10: { minio_seq: 1 }
    10.10.10.11: { minio_seq: 2 }
    10.10.10.12: { minio_seq: 3 }
  vars:
    minio_cluster: minio
    minio_data: '/data{1...2}'                          # use two disks per node
    minio_node: '${minio_cluster}-${minio_seq}.pigsty'  # node name pattern
    haproxy_services:
      - name: minio                   # [REQUIRED] service name, must be unique
        port: 9002                    # [REQUIRED] service port, must be unique
        options:
          - option httpchk
          - option http-keep-alive
          - http-check send meth OPTIONS uri /minio/health/live
          - http-check expect status 200
        servers:
          - { name: minio-1 ,ip: 10.10.10.10 , port: 9000 , options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }
          - { name: minio-2 ,ip: 10.10.10.11 , port: 9000 , options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }
          - { name: minio-3 ,ip: 10.10.10.12 , port: 9000 , options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }
```
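
Apply with the minio playbook, then inspect the deployment with the MinIO client mc; the alias name, default credentials, and --insecure flag below are assumptions for a self-signed test setup:

```bash
./minio.yml -l minio   # provision the 3-node MinIO cluster
mc --insecure alias set pigsty https://10.10.10.10:9000 minioadmin minioadmin
mc --insecure admin info pigsty   # list servers, drives, and cluster status
```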
