Sunday, December 10, 2006

Two node LVS-DR setup on CentOS

We've been working on getting our data center up and going, and part of that was setting up an honest-to-goodness load-balanced system with failover.

If you have gobs of money, the usual way to deal with high availability is to have your n+1 servers fronted by a few F5 load balancers. These keep track of which servers are up and send traffic only to those. Of course, you then need two of them, so that things keep working if one of them goes down. The only problem is that they are expensive. As in, "we don't list the price anywhere" expensive (always a sure sign you can't afford it).

So since we aren't built out of money, and we always prefer things that are open and we can poke at, we decided to build our load balancer and high availability solution using the very same two nodes we will be serving traffic from.

The two pieces of software that make this possible (apart from the excellent Linux kernel itself) are Heartbeat and ldirectord.

Heartbeat takes care of the high availability problem. One box is picked as the primary owner of a service and the other watches and listens. If it no longer hears the primary (or the primary indicates that it is going down), the secondary box takes over the IP address, starts up the services and all is well.
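To make the primary/secondary arrangement a little more concrete, here is a minimal sketch that writes a two node ha.cf. The node names node1 and node2, the eth0 broadcast interface, the timing values and the scratch path are all assumptions for illustration, not our production settings. Heartbeat also needs a matching /etc/ha.d/authkeys on both boxes, which is not shown.

```shell
#!/bin/bash
# Sketch only: write a minimal heartbeat ha.cf for a two node cluster.
# HA_CF defaults to a scratch path so you can inspect the result before
# copying it to /etc/ha.d/ha.cf on both nodes.
HA_CF=${HA_CF:-/tmp/ha.cf.example}

cat > "$HA_CF" <<'EOF'
# seconds between heartbeats
keepalive 2
# seconds of silence before the peer is declared dead
deadtime 30
# send heartbeats as broadcasts on this interface (assumed: eth0)
bcast eth0
# hand resources back to the preferred node when it returns
auto_failback on
# node names must match `uname -n` on each box (assumed names)
node node1
node node2
EOF
```

The deadtime value is the trade-off to think about: too low and a busy box gets declared dead, too high and failover takes longer than you would like.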

ldirectord, on the other hand, does the job of distributing load across a set of nodes. You define which nodes can answer queries and how to tell if they are up, and ldirectord takes care of having requests forwarded on to them.
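As a rough illustration of "which nodes can answer queries, how to tell if they are up", here is a sketch that writes a sample ldirectord.cf for a single HTTP service. The VIP is the one used in this post. The real server addresses 192.168.0.101 and 192.168.0.102, the alive.html check page and its expected "OK" reply are hypothetical, so substitute your own.

```shell
#!/bin/bash
# Sketch only: write a sample ldirectord.cf for one HTTP virtual service.
# LD_CF defaults to a scratch path; the real file lives in /etc/ha.d/.
#
# - "gate" on a real= line selects direct routing, i.e. LVS-DR
# - request/receive: fetch this page and expect this string in the reply
# - the lines under virtual= must be indented
LD_CF=${LD_CF:-/tmp/ldirectord.cf.example}

cat > "$LD_CF" <<'EOF'
# seconds to wait for a health check, and seconds between checks
checktimeout=10
checkinterval=2
# remove dead real servers from the table instead of setting weight 0
quiescent=no

virtual=192.168.0.120:80
        real=192.168.0.101:80 gate
        real=192.168.0.102:80 gate
        service=http
        request="alive.html"
        receive="OK"
        scheduler=rr
        protocol=tcp
        checktype=negotiate
EOF
```

In a two node setup like ours both real= lines point at the same two boxes that run heartbeat, so the active director also serves its share of the traffic.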

By having Heartbeat own ldirectord, you get a high availability load balancer. All for free, all using the same hardware. This costs some CPU, but very little, and in our case the bigger challenge is availability rather than scaling.

Setting this up on CentOS 4.4 is fairly straightforward, except for a few gotchas that took some working through.

If you follow the Ultra Monkey example it will get you most of the way there. The one problem you'll run into, though, is that when heartbeat starts it checks that it doesn't own the VIP, and it will kill the lo:0 loopback alias holding the VIP, which you need in order to serve pages as a working node.
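For reference, the lo:0 alias that ifup and ifdown operate on is defined by an interface file. A minimal sketch of /etc/sysconfig/network-scripts/ifcfg-lo:0 follows, assuming the VIP used in this post. The file is plain shell variable assignments, and your own setup may need extra entries such as NETWORK or BROADCAST.

```shell
# /etc/sysconfig/network-scripts/ifcfg-lo:0 (sketch, values assumed)
DEVICE=lo:0
IPADDR=192.168.0.120
# host netmask so only the VIP itself lives on the loopback
NETMASK=255.255.255.255
ONBOOT=yes
NAME=loopback
```

With this in place, /sbin/ifup lo:0 and /sbin/ifdown lo:0 add and remove the VIP on the loopback.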

The solution is to write a quick script that takes care of both the ARP problem (in LVS-DR every node holds the VIP, so only the active director may answer ARP requests for it) and the lo:0 problem at once, and that is controlled by heartbeat when it brings ldirectord up and down.

You'll find my script below. You'll want to edit it to put in your own VIP, but otherwise it should work as is. I recommend copying it into /etc/ha.d/resource.d and naming it something like arp-answer. You can also link /etc/ha.d/resource.d/startstop to it; that takes care of removing and adding lo:0 at the right times, so heartbeat doesn't complain and you aren't left in a state where the box isn't a proper worker node.
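The copy and link steps can be sketched as a small helper, using the arp-answer name from the haresources file. The function name install_arp_answer and the RESOURCE_D override are hypothetical and exist only so you can try the steps in a scratch directory first.

```shell
#!/bin/bash
# Sketch only: install the script and link it as heartbeat's startstop hook.
# RESOURCE_D can be overridden to rehearse the steps outside /etc.
RESOURCE_D=${RESOURCE_D:-/etc/ha.d/resource.d}

install_arp_answer() {
    local src=$1
    # copy the script into place and make it executable
    install -m 755 "$src" "$RESOURCE_D/arp-answer" || return 1
    # startstop is a relative symlink to the same script
    ln -sf arp-answer "$RESOURCE_D/startstop"
}

# usage (as root, on both nodes): install_arp_answer ./arp-answer
```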

To have the script kick in when heartbeat starts or stops ldirectord, add it as a resource in haresources and make sure it comes before the implicit IPaddr resource (the bare IP address on the line). My haresources file looks something like:


# haresources
node1 arp-answer 192.168.0.120 ldirectord::ldirectord.cf


And finally, here's my arp-answer script:

#!/bin/bash
#
# Handles the various init procedures required on RHEL4 systems to
# use LVS-DR.
#
# Specifically:
# - enables forwarding via sysctl
# - shuts down local lo:0 when starting heartbeat to avoid complaints
# - handles turning on/off arp responses via arptables
#
# You'll want to set this resource before IPaddr or IPaddr2 in your haresources
# as otherwise the lo:0 link will be killed by IPaddr.
#
# You can also use this script as the /etc/ha.d/resource.d/startstop script
# to make sure the system is in an ok state when heartbeat starts and a valid
# LVS-DR node when heartbeat stops. (IE, has a VIP on lo:0 and doesn't answer
# arps)
#
# You must set the VIP address to use here:
VIP=192.168.0.120

host=`/bin/hostname`
case "$1" in

    # if linked as startstop this is triggered when heartbeat first loads
    pre-start)
        # turn on forwarding (assumes sysctl.conf has been modified to
        # have net.ipv4.ip_forward = 1)
        /sbin/sysctl -p

        # give up lo:0
        /sbin/ifdown lo:0

        # don't answer arps yet
        /etc/ha.d/rc.d/arptables-noarp-addr_giveip $VIP
        ;;

    # called when this resource becomes active
    start)
        # give up lo:0
        /sbin/ifdown lo:0

        # we want to answer arps, do so
        /etc/ha.d/rc.d/arptables-noarp-addr_takeip $VIP
        ;;

    # startstop hook, noop
    post-start)
        ;;

    # startstop hook, noop
    pre-stop)
        ;;

    # called either when heartbeat exits completely, or a resource
    # is taken down by heartbeat.
    stop|post-stop)
        # bring up lo:0 again, this will be our loopback with our VIP
        /sbin/ifup lo:0

        # but do not answer arps anymore
        /etc/ha.d/rc.d/arptables-noarp-addr_giveip $VIP
        ;;

    # required by the resource script contract
    status)
        # make sure that eth0:0 is up AND lo:0 is down
        islothere=`/sbin/ifconfig lo:0 | grep $VIP`
        iseththere=`/sbin/ifconfig eth0:0 | grep $VIP`

        # are we ignoring arps on our VIP?
        isarpignore=`/sbin/arptables -L | grep $VIP`

        if [ -n "$islothere" ] || [ -n "$isarpignore" ] || [ -z "$iseththere" ]; then
            # eth0:0 isn't there or lo:0 is there or we are ignoring arps
            echo "LVS-DR director Stopped."
        else
            echo "LVS-DR director Running."
        fi
        ;;

    *)
        # Invalid entry.
        echo "$0: Usage: $0 {pre-start|start|post-start|pre-stop|stop|post-stop|status}"
        exit 1
        ;;
esac
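
The intent of the status branch, going by its comments, is that the director counts as running only when the VIP is on eth0:0, is not on lo:0, and ARP for it is not being ignored. Here is a small sketch that isolates that decision so it can be checked without touching the network. The function name director_state is hypothetical and not part of the script above.

```shell
#!/bin/bash
# Sketch only: the status decision, isolated for testing.
# Arguments are the (possibly empty) grep results for the VIP on lo:0,
# on eth0:0, and in the arptables rules, in that order.
director_state() {
    local lo_has_vip=$1
    local eth_has_vip=$2
    local arp_ignored=$3
    if [ -n "$lo_has_vip" ] || [ -n "$arp_ignored" ] || [ -z "$eth_has_vip" ]; then
        echo "Stopped"
    else
        echo "Running"
    fi
}

director_state "" "inet addr:192.168.0.120" ""   # prints Running
director_state "inet addr:192.168.0.120" "" ""   # prints Stopped
```

On a live pair, running /etc/ha.d/resource.d/arp-answer status on each box should report Running on exactly one of them.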