-
Notifications
You must be signed in to change notification settings - Fork 372
Postgres replication setup and failover
Postgres replication is setup using 2nodes - one node acts as master and other node acts as slave / stand_by node. The replication is setup using streaming replication as explained in https://wiki.postgresql.org/wiki/Streaming_Replication
In case of failure in master, slave node will be promoted to master by creating trigger file as per trigger-file
config in recovery.conf
. More explanation can be found in Replication setup
In normal operation, applications connect to master node. After failover, application configuration should be changed connect to new master node (previous slave node)
We need to check if master can be behind a load balancer which can be pointed to new node after failover to avoid changing application configuration after failover. Other alternative would be using DNS name for master which can be switched to point to new master node after failover
If the master postgres server crashes, follow the below steps
- Run the jenkins job
Postgresql_Slave_To_Master_Promotion
- Change the inventory file in ansible scripts to point
postgresql-master
points to previous slave node. Example if your inventory has
[spike-postgresql-master:children]
spike-postgresql-1
[spike-postgresql-slave:children]
spike-postgresql-2
change it to
[spike-postgresql-master:children]
spike-postgresql-2
- Run the jenkins job
Postgresql_Master
. This ensures new master node allows replication to future slave node(s) - Change the application(s) config to point to new master node
Note: Few application config already uses
postgres_host: {{ groups['postgresql-master'][0]
}}. This would not need any change as inventory was updated in previous step)
- Deploy the application(s) to pick up new configuration
- If you have a new server to act as slave node. Add the new slave in inventory
- Run the jenkins job
Postgresql_Slave
- Run the jenkins job
Monitoring