In this document, we are working with a cluster under CentOS 7.2, mostly using the
pcs command. It assumes that the
pcsd daemon is enabled and running and that
authentication between nodes is set up (see quick start).
Make sure to experiment and train yourself on a testing platform. Write your own documentation for your own environment, check it, and exercise it on a regular basis.
Here is the command to start the cluster on all existing nodes:
# pcs cluster start --all
srv2: Starting Cluster...
srv1: Starting Cluster...
srv3: Starting Cluster...
Here is the command to stop the cluster on the local node where it is executed:
# pcs cluster stop
You can also add a designated node if needed, eg.:
# pcs cluster stop srv2
It stops or moves away all the resources on the node, then stops Pacemaker and Corosync.
Note that the cluster forbids you from stopping too many nodes, so that it can keep the quorum:
# pcs cluster stop
Error: Stopping the node will cause a loss of the quorum, use --force to override
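The refusal above follows from quorum arithmetic. As a rough sketch, assuming the default of one vote per node and no special quorum option (such as a two-node mode), a majority of floor(n/2)+1 nodes must stay up. The helper names below are illustrative only and are not part of pcs:

```bash
# Smallest number of nodes that still forms a majority of $1 nodes.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# Number of nodes that can be stopped while the remaining ones keep quorum.
max_stoppable() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

quorum_size 3      # prints 2
max_stoppable 3    # prints 1
```

With three nodes, only one node can be stopped this way; stopping a second one requires --force or stopping the whole cluster with --all.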
Replace stop with start for the opposite command, to start the
cluster on one node. These two commands are equivalent:
srv2# pcs cluster start       # executed on srv2
srv1# pcs cluster start srv2  # executed from srv1
If you want to stop the cluster on all nodes, just add
--all to your command:
# pcs cluster stop --all
srv3: Stopping Cluster (pacemaker)...
srv2: Stopping Cluster (pacemaker)...
srv1: Stopping Cluster (pacemaker)...
srv3: Stopping Cluster (corosync)...
srv2: Stopping Cluster (corosync)...
srv1: Stopping Cluster (corosync)...
This last command is perfectly safe and your cluster will start cleanly when desired.
To avoid moving your resources around during cluster shutdown (eg. if one node is shutting down slowly), ask the cluster to stop the resource first. Eg.:
# pcs resource disable pgsql-ha --wait
# pcs cluster stop --all
On cluster startup, you will have to do the opposite action:
# pcs cluster start --all --wait
# pcs resource enable pgsql-ha --wait
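The shutdown and startup sequences can be wrapped in two small functions so the order is never mixed up. This is only a sketch: the PCS and RESOURCE variables are conventions of this example, not pcs features. By default the commands are printed instead of executed; export PCS=pcs to run them for real.

```bash
# Dry run by default: each command is printed instead of executed.
PCS="${PCS:-echo pcs}"
RESOURCE="${RESOURCE:-pgsql-ha}"

# Stop the PostgreSQL resource first, then the cluster on every node.
cluster_shutdown() {
    $PCS resource disable "$RESOURCE" --wait &&
    $PCS cluster stop --all
}

# Start the cluster on every node, then bring the resource back.
cluster_startup() {
    $PCS cluster start --all --wait &&
    $PCS resource enable "$RESOURCE" --wait
}

cluster_shutdown
cluster_startup
```

Each function stops at the first failing command, so the cluster is not stopped if the resource could not be disabled.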
In this chapter, we describe how to move the primary role from one node to another and bring the former primary back into the cluster as a standby.
Here is the command to move the primary role from srv1 to srv2:
# pcs resource move --master pgsql-ha srv2
# pcs resource clear pgsql-ha
That’s it. Note that the former primary becomes a standby and starts replicating from the new primary.
You could add
--wait so the command exits when everything is done. Here is an
example moving the primary back to srv1:
# pcs resource move --wait --master pgsql-ha srv1
Resource 'pgsql-ha' is master on node srv1; slave on node srv2.
# pcs resource clear pgsql-ha
An INFINITY location constraint is set to move the primary role to the given
node. You must clear this constraint with the pcs resource clear command to avoid unexpected location behavior.
Giving the destination node is not mandatory. If no destination node is given, a
-INFINITY score is set on the node currently hosting the primary to force it to move away:
# pcs resource move --wait --master pgsql-ha
Warning: Creating location constraint cli-ban-pgsql-ha-on-srv1 with a score of -INFINITY for resource pgsql-ha on node srv1.
This will prevent pgsql-ha from being promoted on srv1 until the constraint is removed. This will be the case even if srv1 is the last node in the cluster.
Resource 'pgsql-ha' is master on node srv2; slave on node srv1.
# pcs constraint show | grep Master
    Disabled on: srv1 (score:-INFINITY) (role: Master)
# pcs resource clear pgsql-ha
# pcs constraint show | grep Master
(nothing)
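Because forgetting the clear step leaves a constraint behind, the two commands can be tied together. The function below is a minimal sketch under the same assumptions as the commands above (resource named pgsql-ha). PCS and RESOURCE are variables of this example only, and the default is a dry run that prints the commands; export PCS=pcs to execute them.

```bash
# Dry run by default: each command is printed instead of executed.
PCS="${PCS:-echo pcs}"
RESOURCE="${RESOURCE:-pgsql-ha}"

# Move the primary role, optionally to a given node, then always clear
# the location constraint created by the move.
switchover() {
    local target rc
    target="${1:-}"
    # ${target:+...} adds the node name only when one was given
    $PCS resource move --wait --master "$RESOURCE" ${target:+"$target"}
    rc=$?
    $PCS resource clear "$RESOURCE"
    return "$rc"
}

switchover srv2
```

The constraint is cleared even when the move fails, and the exit status of the move is returned so callers can still detect the failure.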
Updating the PostgreSQL Auto-Failover resource agent does not require stopping
your PostgreSQL cluster. You just need to make sure the cluster manager does not
decide to run an action while the system updates the
pgsqlms script or the
libraries. This is quite improbable, but still possible.
The easiest way to achieve a clean update is to put the whole cluster in maintenance mode and update PAF, e.g.:
# pcs property set maintenance-mode=true
# yum install -y https://github.com/ClusterLabs/PAF/releases/download/v2.2.1/resource-agents-paf-2.2.1-1.noarch.rpm
# pcs property set maintenance-mode=false
That’s it, you are done.
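The three steps above can be sketched as a wrapper that runs any command while the cluster is in maintenance mode. This is an illustration, not part of PAF or pcs: with_maintenance and the PCS variable are hypothetical names, and the default is a dry run that only prints the pcs commands (export PCS=pcs to execute them).

```bash
# Dry run by default: pcs commands are printed instead of executed.
PCS="${PCS:-echo pcs}"

# Run the given command with the cluster in maintenance mode.
# Maintenance mode is only left if the command succeeds, so a failed
# update can be inspected before the cluster takes control again.
with_maintenance() {
    $PCS property set maintenance-mode=true || return 1
    if "$@"; then
        $PCS property set maintenance-mode=false
    else
        echo "command failed, cluster left in maintenance mode" >&2
        return 1
    fi
}

# Placeholder command: replace it with the real update, e.g. the yum call above.
with_maintenance echo "updating PAF"
```

Keeping the cluster in maintenance mode after a failure is a design choice of this sketch; you may prefer to always leave maintenance mode.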
If putting the whole cluster in maintenance mode is not an option for you, you must ask the cluster to ignore only your PostgreSQL resources. The cluster will still be in charge of other resources.
Consider that the PostgreSQL multi-state resource is called pgsql-ha.
The following command achieves two goals. First, it forbids the cluster
resource manager from reacting to unexpected statuses by putting the resource in
unmanaged mode (
unmanage pgsql-ha). Second, it stops the monitor actions
for this resource (--monitor):
# pcs resource unmanage pgsql-ha --monitor
Notice that (unmanaged) now appears in
crm_mon. In the output of the following command, the meta attribute
is-managed=false appears for the
pgsql-ha resource and
enabled=false appears for the monitor actions:
# pcs resource show pgsql-ha
 Master: pgsql-ha
  Meta Attrs: notify=true
  Resource: pgsqld (class=ocf provider=heartbeat type=pgsqlms)
   Attributes: bindir=/usr/pgsql-10/bin pgdata=/var/lib/pgsql/10/data
   Meta Attrs: is-managed=false
   Operations: demote interval=0s timeout=120s (pgsqld-demote-interval-0s)
               methods interval=0s timeout=5 (pgsqld-methods-interval-0s)
               monitor enabled=false interval=15s role=Master timeout=10s (pgsqld-monitor-interval-15s)
               monitor enabled=false interval=16s role=Slave timeout=10s (pgsqld-monitor-interval-16s)
               notify interval=0s timeout=60s (pgsqld-notify-interval-0s)
               promote interval=0s timeout=30s (pgsqld-promote-interval-0s)
               reload interval=0s timeout=20 (pgsqld-reload-interval-0s)
               start interval=0s timeout=60s (pgsqld-start-interval-0s)
               stop interval=0s timeout=60s (pgsqld-stop-interval-0s)
Now, update PAF, eg.:
# yum install -y https://github.com/ClusterLabs/PAF/releases/download/v2.2.1/resource-agents-paf-2.2.1-1.noarch.rpm
We can now put the resource in
managed mode again and enable the monitor actions:
# pcs resource manage pgsql-ha --monitor
NOTE: you might want to enable the monitor actions first to check that everything is fine before giving control back to the cluster. You can enable the monitor actions using the following commands (you must set all the parameters related to the action):
# pcs resource update pgsqld op monitor role=Master timeout=10s interval=15s enabled=true
# pcs resource update pgsqld op monitor role=Slave timeout=10s interval=16s enabled=true
The monitor actions should be executed immediately and report no errors. Check that everything is running correctly in
crm_mon and your log files before managing the resource itself again (without the --monitor option):
# pcs resource manage pgsql-ha
This chapter explains how to do a minor upgrade of PostgreSQL on a two-node
cluster. Nodes are called srv1 and
srv2, the PostgreSQL HA resource is pgsql-ha, and
srv1 is hosting the primary.
The process is quite simple: upgrade the standby first, move the primary role and finally upgrade PostgreSQL on the former PostgreSQL primary node.
Here is how to upgrade PostgreSQL on the standby side:
# yum install --downloadonly postgresql93 postgresql93-contrib postgresql93-server
# pcs resource ban --wait pgsql-ha srv2
# yum install -y postgresql93 postgresql93-contrib postgresql93-server
# pcs resource clear pgsql-ha
Here are the details of these commands: the first one downloads the required packages without installing them; the second one bans the
pgsql-ha resource only from
srv2, effectively stopping it; the third one upgrades the PostgreSQL packages; the last one allows the
pgsql-ha resource to run on
srv2 again, effectively starting it.
Now, we can move the primary resource to
srv2, then take care of srv1:
# pcs resource move --wait --master pgsql-ha srv2
# yum install --downloadonly postgresql93 postgresql93-contrib postgresql93-server
# pcs resource ban --wait pgsql-ha srv1
# yum install -y postgresql93 postgresql93-contrib postgresql93-server
# pcs resource clear pgsql-ha
Minor upgrade is finished. Feel free to move your primary back to
srv1 if you
really need it.
In this chapter, we add server
srv3 hosting a PostgreSQL standby instance as a
new node in an existing two node cluster.
Set up everything so PostgreSQL can start on
srv3 as a standby and enter
streaming replication. Remember to create the recovery configuration template
file, set up the
pg_hba.conf file, etc.
On this new node, set up the pcsd daemon and its authentication:
# passwd hacluster
# systemctl enable pcsd
# systemctl start pcsd
# pcs cluster auth srv1 srv2 srv3 -u hacluster
On all other nodes, authenticate to the new node:
# pcs cluster auth srv3 -u hacluster
We are now ready to add the new node.
NOTE: Put the cluster in maintenance mode or use crm_simulate if you are afraid that some of your resources will move all over the place when the new node appears.
# pcs cluster node add srv3
NOTE: If corosync is set up to use multiple networks for redundancy, use the following command:
# pcs cluster node add srv3,srv3-alt
Reload the corosync configuration on all the nodes if needed (it should not be necessary, but it doesn’t hurt):
# pcs cluster reload corosync
Fencing is mandatory. See: http://clusterlabs.github.com/PAF/fencing.html.
Either edit the existing fencing resources to handle the new node if applicable,
or add a new one able to do it. In this example, we are using the
fence_virsh fencing agent to create a dedicated fencing resource able to
fence srv3:
# pcs stonith create fence_vm_srv3 fence_virsh pcmk_host_check="static-list" \
    pcmk_host_list="srv3" ipaddr="192.168.122.1" \
    login="<username>" port="srv3-c7" identity_file="/root/.ssh/id_rsa" \
    action="off"
# pcs constraint location fence_vm_srv3 avoids srv3=INFINITY
We can now start the cluster on srv3:
# pcs cluster start
After some time, checking the cluster with
pcs status, you should see
srv3 appearing in the cluster.
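Instead of re-running pcs status by hand, a small polling helper can wait for the node. This is only a sketch: wait_until and node_online are hypothetical helper names, and node_online assumes that pcs status lists active nodes on a line starting with "Online:", which may differ between versions.

```bash
# Retry a command once per second until it succeeds or the attempts run out.
wait_until() {
    local attempts
    attempts="$1"
    shift
    while [ "$attempts" -gt 0 ]; do
        "$@" && return 0
        attempts=$(( attempts - 1 ))
        sleep 1
    done
    return 1
}

# Succeeds when the given node name appears on the "Online:" line.
node_online() {
    pcs status 2>/dev/null | grep '^Online:' | grep -qw "$1"
}

# Example: wait up to about 60 seconds for srv3 to join.
# wait_until 60 node_online srv3
```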
If your PostgreSQL standby is not started on the new node, maybe the cluster has
been set up with a hard-coded
clone-max value. Check with:
# pcs resource show pgsql-ha
If a clone-max value is set, either:
remove it if you don’t mind having a clone on each node:
# pcs resource meta pgsql-ha clone-max=
set it to the needed value:
# pcs resource meta pgsql-ha clone-max=3
Your standby instance should start shortly.
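To check for clone-max without reading the whole output, the value can be extracted with sed. The sketch below assumes the attribute is printed as clone-max=N somewhere in the pcs resource show output, as meta attributes are in the example output earlier in this document; clone_max_from_show is a hypothetical helper name.

```bash
# Print the clone-max value found in "pcs resource show" output read on
# stdin; print nothing when the attribute is not set.
clone_max_from_show() {
    sed -n 's/.*clone-max=\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Example: pcs resource show pgsql-ha | clone_max_from_show
```

An empty result means no hard limit is set and nothing needs to be changed.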
This chapter explains how to remove a node called
srv3 from a three-node cluster.
The first command will put the node in standby. It stops all resources on the node:
# pcs cluster standby srv3
The next command simply removes the node from the cluster. It stops Pacemaker on
srv3, removes the cluster setup from it and reconfigures the other nodes:
# pcs cluster node remove srv3
srv3: Stopping Cluster (pacemaker)...
srv3: Successfully destroyed cluster
srv1: Corosync updated
srv2: Corosync updated
If you chose to set a specific
clone-max attribute on the pgsql-ha
resource, update it. You don’t need to update it if it is not set (see the previous chapter):
# pcs resource meta pgsql-ha clone-max=2
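The removal steps of this chapter can be sketched as a single function. As elsewhere, this is an illustration only: remove_node, PCS and RESOURCE are names of this example, and the default is a dry run that prints the commands (export PCS=pcs to execute them).

```bash
# Dry run by default: each command is printed instead of executed.
PCS="${PCS:-echo pcs}"
RESOURCE="${RESOURCE:-pgsql-ha}"

# Put a node in standby, remove it from the cluster and, when a second
# argument is given, set clone-max to that value.
remove_node() {
    local node clone_max
    node="$1"
    clone_max="${2:-}"
    $PCS cluster standby "$node" || return 1
    $PCS cluster node remove "$node" || return 1
    if [ -n "$clone_max" ]; then
        $PCS resource meta "$RESOURCE" clone-max="$clone_max"
    fi
}

remove_node srv3 2
```

Omit the second argument when clone-max was never set on the resource.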
First, read the watchdog chapter of the “How to fence your node” documentation page for some theory.
We now explain how to set up a watchdog device as a fencing method in Pacemaker. The
i6300esb watchdog “hardware” has been added to the virtual
machines in our demo cluster. This hardware is correctly discovered on boot by the kernel:
Dec 7 14:47:21 srv1 kernel: i6300esb: Intel 6300ESB WatchDog Timer Driver v0.05
Dec 7 14:47:21 srv1 kernel: i6300esb: initialized (0xffffc90000128000). heartbeat=30 sec (nowayout=0)
NOTE: in your test environment, you could use a software watchdog from the Linux kernel called
softdog. This is fine as long as it is only for demo purposes or as a last resort. In production, you should definitely rely on a hardware watchdog, which is not tied to the operating system.
First we need to stop the cluster to set everything up. The watchdog capability is detected by the cluster manager on each node during the cluster startup.
# pcs cluster stop --all
Install and enable
sbd. This small daemon is the glue between the watchdog
device and the inter-communication with Pacemaker:
# yum install -y sbd
# systemctl enable sbd.service
Edit /etc/sysconfig/sbd and make sure you have
SBD_WATCHDOG_DEV pointing to the correct device.
You can adjust the value of
SBD_WATCHDOG_TIMEOUT to suit your needs. This last
variable is the time sbd will use to initialize the recurrent hardware watchdog timer.
We can now start the cluster again so that each node detects its watchdog capability during startup:
# pcs cluster start --all
After some seconds, the following command should return true:
# pcs property show | grep have-watchdog
 have-watchdog: true
Then set the
stonith-watchdog-timeout cluster property:
# pcs property set stonith-watchdog-timeout=10s
A good value for
stonith-watchdog-timeout is double the value of SBD_WATCHDOG_TIMEOUT.
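As a worked example of this rule of thumb, the sketch below reads SBD_WATCHDOG_TIMEOUT from the sbd configuration file and prints twice its value. It assumes the variable is set as a plain number of seconds, optionally quoted; suggest_stonith_timeout is a hypothetical helper name.

```bash
# Print twice the SBD_WATCHDOG_TIMEOUT found in the given sbd configuration
# file (default: /etc/sysconfig/sbd), with an "s" suffix for seconds.
suggest_stonith_timeout() {
    local conf timeout
    conf="${1:-/etc/sysconfig/sbd}"
    # Keep the last uncommented assignment; accept an optional opening quote.
    timeout=$(sed -n 's/^SBD_WATCHDOG_TIMEOUT="\{0,1\}\([0-9][0-9]*\).*/\1/p' "$conf" | tail -n 1)
    [ -n "$timeout" ] || return 1
    echo "$(( timeout * 2 ))s"
}

# Example: pcs property set stonith-watchdog-timeout="$(suggest_stonith_timeout)"
```

With SBD_WATCHDOG_TIMEOUT=5, the function prints 10s, which matches the value used in the command above.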
Now, if you kill the sbd process, the node should reset itself in less than the configured watchdog timeout:
# killall -9 sbd
The following command should ask the remote node to fence itself using its watchdog (if no other fencing device exists):
# stonith_admin -F srv1
If you stop Pacemaker but not Corosync or simulate a resource failing to stop or a resource fatal error, the node should fence itself immediately.
In this chapter, we need to set up a node where no PostgreSQL instance of your cluster is supposed to run. This may be because PostgreSQL is not installed on this node, because the instance is part of a different resource cluster, etc.
The following command forbids your multi-state PostgreSQL resource
pgsql-ha from running on the node called srv3:
# pcs constraint location pgsql-ha rule resource-discovery=never score=-INFINITY \#uname eq srv3
This creates a location constraint associated with a rule allowing us to avoid
( score=-INFINITY ) the node srv3
( \#uname eq srv3 ) for the resource pgsql-ha.
resource-discovery=never is mandatory here as it
forbids the “probe” action the CRM usually runs to discover the state of
a resource on a node. On a node where your PostgreSQL cluster is not running,
this “probe” action will fail, leading to bad cluster reactions.
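If several nodes must be excluded, the same constraint can be generated in a loop. This is a sketch only: forbid_on_nodes, PCS and RESOURCE are names of this example, and the default is a dry run that prints the commands (export PCS=pcs to execute them). Note that #uname is quoted here, which has the same effect as the backslash used above: it keeps the shell from treating it as a comment.

```bash
# Dry run by default: each command is printed instead of executed.
PCS="${PCS:-echo pcs}"
RESOURCE="${RESOURCE:-pgsql-ha}"

# Forbid the resource, and its probes, on every node given as argument.
forbid_on_nodes() {
    local node
    for node in "$@"; do
        $PCS constraint location "$RESOURCE" rule resource-discovery=never \
            score=-INFINITY '#uname' eq "$node"
    done
}

forbid_on_nodes srv3
```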
In this chapter, we are using a three-node cluster with one PostgreSQL primary instance and two standby instances.
As usual, we start from the cluster created in the quick start documentation, where the
pgsql-pri-ip address is linked to the primary.
See the Quick Start CentOS 7 for more information.
We want to create two IP addresses with the following properties:
To make this possible, we have to play with the resources co-location scores.
First, let’s add two
IPaddr2 resources called pgsql-ip-stby1 and
pgsql-ip-stby2 holding IP addresses 192.168.122.49 and 192.168.122.48:
# pcs resource create pgsql-ip-stby1 ocf:heartbeat:IPaddr2 \
    cidr_netmask=24 ip=192.168.122.49 op monitor interval=10s
# pcs resource create pgsql-ip-stby2 ocf:heartbeat:IPaddr2 \
    cidr_netmask=24 ip=192.168.122.48 op monitor interval=10s
We want both IP addresses to avoid co-locating with each other. We add
a co-location constraint so pgsql-ip-stby2 avoids
pgsql-ip-stby1 with a score of
-20 (higher than the stickiness of the cluster):
# pcs constraint colocation add pgsql-ip-stby2 with pgsql-ip-stby1 -20
NOTE: that means the cluster manager has to start
pgsql-ip-stby1 first to decide where
pgsql-ip-stby2 should start according to the new scores in the cluster. Also, that means that whenever you move
pgsql-ip-stby1 to another node, the cluster might have to stop
pgsql-ip-stby2 first and restart it elsewhere depending on the new scores.
Now, we add similar co-location constraints to define that each IP address
prefers to run on a node with a standby of pgsql-ha:
with slave pgsql-ha 100 means that the IP will prefer to bind with a standby
with pgsql-ha 50 means that the IP will prefer to bind with a pgsql-ha instance, whatever its role
We give higher priority to the standbys with the
100 score, but should they
all be stopped, the
50 score pushes the IP to move to the primary.
# pcs constraint colocation add pgsql-ip-stby1 with slave pgsql-ha 100
# pcs constraint order start pgsql-ha then start pgsql-ip-stby1 kind=Mandatory
# pcs constraint colocation add pgsql-ip-stby2 with slave pgsql-ha 100
# pcs constraint order start pgsql-ha then start pgsql-ip-stby2 kind=Mandatory
# pcs constraint colocation add pgsql-ip-stby1 with pgsql-ha 50
# pcs constraint colocation add pgsql-ip-stby2 with pgsql-ha 50
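Since both IP addresses receive the same three constraints, they can be generated in a loop, which helps when more standby addresses are added later. This is a sketch only: add_standby_ip_constraints, PCS and RESOURCE are names of this example, and the default is a dry run that prints the commands (export PCS=pcs to execute them). The constraints are created in a different order than above, which is assumed not to matter.

```bash
# Dry run by default: each command is printed instead of executed.
PCS="${PCS:-echo pcs}"
RESOURCE="${RESOURCE:-pgsql-ha}"

# For each IP resource: prefer a standby (100), start after the PostgreSQL
# resource, and fall back to any instance (50).
add_standby_ip_constraints() {
    local ip
    for ip in "$@"; do
        $PCS constraint colocation add "$ip" with slave "$RESOURCE" 100
        $PCS constraint order start "$RESOURCE" then start "$ip" kind=Mandatory
        $PCS constraint colocation add "$ip" with "$RESOURCE" 50
    done
}

add_standby_ip_constraints pgsql-ip-stby1 pgsql-ip-stby2
```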