HA in Linux is pretty easy
NOTICE (very important):
- Watch out for typos. Verify that every IP address and hostname is correct before configuring PCS.
- If a service does not start correctly, check the service logs under /var/log/cluster or /var/log/pcsd. journalctl -xe does not help at all.
1> Install PCS
- yum install pcs -y
2> Configure PCS. Please run the following commands on ALL nodes:
- systemctl start pcsd
- systemctl enable pcsd
- passwd hacluster
3> Configure PCS. Run the following commands on ONE node:
- pcs cluster auth <NODE-1> <NODE-2> <...>
- pcs cluster setup --name <CLUSTER_NAME> <NODE-1> <NODE-2> <...>
- pcs cluster start --all
- pcs cluster enable --all
- pcs resource create virtual_ip ocf:heartbeat:IPaddr2 ip=<IP_ADDR> cidr_netmask=32 op monitor interval=30s
- pcs resource create <APPLICATION_MON> ocf:heartbeat:nginx configfile=/etc/nginx/nginx.conf op monitor timeout="5s" interval="5s" (Note: this example uses nginx as the application.)
- pcs property set stonith-enabled=false
- pcs property set no-quorum-policy=ignore
- pcs constraint colocation add <APPLICATION_MON> virtual_ip INFINITY
- pcs constraint order virtual_ip then <APPLICATION_MON>
- pcs cluster stop --all
- pcs cluster start --all
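The one-node steps above can be collected into a small script. This is only a sketch: the variable names (CLUSTER_NAME, NODES, VIP, APP), their default values, and the run/setup_cluster helpers are hypothetical and not part of pcs. By default it only prints the commands (DRY_RUN=1), so they can be reviewed for typos first. Note that pcs cluster auth still prompts for the hacluster credentials when run for real.

```shell
#!/bin/sh
# Hypothetical placeholder values; override them in the environment.
CLUSTER_NAME=${CLUSTER_NAME:-mycluster}
NODES=${NODES:-"node1 node2"}
VIP=${VIP:-192.168.1.50}
APP=${APP:-proxy}
# DRY_RUN=1 (the default here) only prints the commands.
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Same commands, in the same order, as the step 3 list.
setup_cluster() {
    # $NODES is intentionally unquoted so each node becomes its own argument.
    run pcs cluster auth $NODES
    run pcs cluster setup --name "$CLUSTER_NAME" $NODES
    run pcs cluster start --all
    run pcs cluster enable --all
    run pcs resource create virtual_ip ocf:heartbeat:IPaddr2 \
        ip="$VIP" cidr_netmask=32 op monitor interval=30s
    run pcs resource create "$APP" ocf:heartbeat:nginx \
        configfile=/etc/nginx/nginx.conf op monitor timeout=5s interval=5s
    run pcs property set stonith-enabled=false
    run pcs property set no-quorum-policy=ignore
    run pcs constraint colocation add "$APP" virtual_ip INFINITY
    run pcs constraint order virtual_ip then "$APP"
    run pcs cluster stop --all
    run pcs cluster start --all
}
```

Call setup_cluster to review the printed commands, then set DRY_RUN=0 and call it again to execute them.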
The final output should look like:
[root@node1 nginx]# pcs status
Cluster name: csr-proxy-ha
Stack: corosync
Current DC: hq-csr-proxy-3 (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
Last updated: Thu Nov 15 10:59:40 2018
Last change: Thu Nov 15 10:58:13 2018 by root via cibadmin on node1
2 nodes configured
2 resources configured
Online: [ node1 node2]
Full list of resources:
virtual_ip (ocf::heartbeat:IPaddr2): Started node1
proxy (ocf::heartbeat:nginx): Starting node1
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
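To check the status output in a script instead of by eye, something like the following might work. It is a rough sketch that assumes resource lines look like the ones above (name, agent in parentheses starting with ocf::, then the state); the function name resources_not_started is made up for this example. In the sample output above it would report the proxy line, because that resource is still "Starting".

```shell
#!/bin/sh
# Read `pcs status` output on stdin and print every ocf resource line
# whose state is not "Started". Returns 0 only if all are started.
resources_not_started() {
    # Keep lines with an agent spec such as (ocf::heartbeat:nginx):
    # then drop the ones that are already in the Started state.
    bad=$(grep -E '\(ocf::[^)]*\):' \
        | grep -v -E '\):[[:space:]]+Started([[:space:]]|$)' || true)
    if [ -n "$bad" ]; then
        printf '%s\n' "$bad"
        return 1
    fi
    return 0
}

# Example usage on a cluster node:
#   pcs status | resources_not_started && echo "all resources started"
```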
Additional troubleshooting:
ERROR message:
[root@NODE1 ~]# pcs cluster start --all
NODE2: Error connecting to NODE2 - (HTTP error: 400)
NODE1: Starting Cluster...
Error: unable to start all nodes
NODE2: Error connecting to NODE2- (HTTP error: 400)
Cause:
A hostname or IP address is probably misspelled. Check that every node name is spelled correctly and resolves.
You can use this command:
host <node1>
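The host command queries DNS only, so it does not see names defined in /etc/hosts. As an alternative sketch, getent hosts follows the system resolver order and can be looped over all nodes at once. The function name check_nodes is hypothetical, and the node names are whatever you pass in.

```shell
#!/bin/sh
# Report whether each given node name resolves via the system resolver
# (/etc/hosts, DNS, ...). Returns non-zero if any name fails.
check_nodes() {
    failed=0
    for node in "$@"; do
        if getent hosts "$node" >/dev/null 2>&1; then
            echo "OK   $node"
        else
            echo "FAIL $node"
            failed=1
        fi
    done
    return $failed
}

# Example usage, run on every node before "pcs cluster auth":
#   check_nodes node1 node2
```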
ERROR message:
virtual_ip (ocf::heartbeat:IPaddr2): Stopped
Failed Actions:
* virtual_ip_start_0 on NODE1 'unknown error' (1): call=6, status=complete, exitreason='Unable to find nic or netmask.',
last-rc-change='Thu Nov 15 10:31:55 2018', queued=0ms, exec=454ms
* virtual_ip_start_0 on NODE2 'unknown error' (1): call=6, status=complete, exitreason='Unable to find nic or netmask.',
Cause:
Check that the VIP is in the same subnet as each node.
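The subnet comparison can be done with plain shell arithmetic. This is a minimal IPv4-only sketch; the helper names ip_to_int and same_subnet are made up here, and the prefix length is assumed to be the one on the node's interface (as shown by ip -4 addr show), not the cidr_netmask=32 given to the resource.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    # Split the address into its four octets ($1..$4).
    set -- $1
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# same_subnet VIP NODE_IP PREFIX
# Returns 0 if both addresses share the same network for that prefix.
same_subnet() {
    vip=$(ip_to_int "$1")
    node=$(ip_to_int "$2")
    prefix=$3
    # Netmask: PREFIX one-bits followed by zero-bits.
    mask=$(( (0xFFFFFFFF << (32 - $prefix)) & 0xFFFFFFFF ))
    [ $(( $vip & $mask )) -eq $(( $node & $mask )) ]
}

# Example usage with placeholder addresses:
#   same_subnet 192.168.1.50 192.168.1.10 24 && echo "VIP is reachable from this subnet"
```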