Tuesday, 25 May 2010

How To: Load Balancing & Failover With Dual/ Multi WAN / ADSL / Cable Connections on Linux

In many location, including but definitely not limited to India, single ADSL / Cable connections can be unreliable and also may not provide sufficient bandwidth for your purposes. One way to increase reliability and bandwidth of your internet connection is to distribute the load (load balancing) using multiple connections. It is also imperative to have transparent fail-over so routes are automatically adjusted depending on the availability of the connections. With load balancing and fail-over you can have reliable connectivity over two or more unreliable broadband connections (like BSNL or Tata Indicom in India). I present you with the simplest solution to a complex problem with live examples.

Note: Load balancing doesn't increase connection speed for a single connection. Its benefits are realized over multiple connections like in an office environment. The benefits of fail-over are however realized even in a single user environment.

The load balancing mechanism, to be discussed with example below, in Linux caches routes and doesn't provide transparent fail-over support. There are two solutions to incorporate transparent fail over - 1. compiling and using a custom Linux kernel with Julian Anastasov's kernel patches for dead gateway detection or 2. user space script to monitor connections and dynamically change routing information.

Julian Anastasov's patches have two problems:
1. They work only when the first hop gateway is down. In many cases, including ours, the first hop gateway is the adsl modem cum router which is always up. So we need a more robust solution for our purposes.

2. You have to compile a custom kernel with patches. This is somewhat complex procedure with reasonable chances of screwing up something. It also forces you to re-patch the kernel every time you decide to update your kernel. Overall I wouldn't recommend anyone going for kernel patching route unless that is the only option. Also in that case you should look for a rpm based solution (like livna rpm for nVidia drivers) which does it automatically for you.

A better solution is to use a userspace program which monitors your connection and updates routes as necessary. I will provide a script which we use to constantly monitor our connections. It provides transparent fail over support with two ADSL connections. It is fully configurable and can be used for any standard dual ADSL / Cable connections to provide transparent fail over support. It can also be easily modified to use for more than two connections. You can also use it to log uptime / downtime of your connections like we did.

Let's first discuss load balancing with two ADSL / Cable connections and then we will see how to provide transparent fail-over support. The ideas and script provided here can be easily used for more than two connections with minor modifications.

Requirements for Load Balancing multiple ADSL / Cable Connections
1. Obviously you need to have multiple (A)DSL or Cable connections in the first place. Login as root for this job.

2. Find out the LAN / internal IP address of the modems. They may be same like 1921.168.1.1.
Check if the internal / LAN IP address of both (or multiple) modems are same. In that use the web / telnet interface of the modems to configure one of the modems to have a different internal IP address preferably in different networks like 192.168.0.1 or 192.168.2.1 etc. If you are using multiple modems then you should configure each of them to have different subnets. This is important because now you can easily access the different modems from their web interface and you don't have to bother connecting to a modem through a particular interface. It is also important because now you can easily configure the interfaces to be associated with different netmasks / sub-network.

3. Connect each modem to the computer using a different interface (eth0, eth1 etc.). You may be able to use the same interface but this guide doesn't cover that. In short you will make your life complicated using the same interface or even different virtual interface. My recommendation is that you should use one interface per modem. Don't scrimp on cheap ethernet adapters. This has the added benefit of redundancy should one adapter go bad down the road.

4. Configure the IP address of each interface to be in the same sub-network as the modem. For example my modems have IP addresses of 192.168.0.1 and 192.168.1.1. The corresponding addresses & netmasks of the interfaces are: 192.168.0.10 (netmask: 255.255.255.0) and 192.168.1.10 (netmask: 255.255.255.0).

5. Find out the following information before you proceed with the rest of the guide:

IP address of external interfaces (interfaces connected to your modems). This is not the gateway address.
Gateway IP address of each broadband connections. This is the first hop gateway, could be your DSL modem IP address if it has been configured as the gateway following the tip below.
Name, IP address & netmask of external interfaces like eth1, eth2 etc. My external interfaces are eth1 & eth2.
Relative weights you want to assign to each connection. My Tata connection is 4 times faster than BSNL connection. So I assign the weight of 4 to Tata and 1 to BSNL. You must use low positive integer values for weights. For same connection speeds weights of 1 & 1 are appropriate. The weights determine how the load is balanced across multiple connections. In my case Tata is 4 times as likely to be used as route for a particular site in comparison with BSNL.
Note: Refer to Netmask guide for details on netmasks.

Optional step
Check the tips on configuring (A)DSL modems. They are not required for using this guide. However they are beneficial in maximizing your benefits.

How to setup default load balancing for multiple ADSL / Cable connections
Unlike other guides on this topic I will use a real example - the configuration on our internal network. So to begin with here are the basic data for my network:

#IP address of external interfaces. This is not the gateway address.
IP1=192.168.1.10
IP2=192.168.0.10

#Gateway IP addresses. This is the first (hop) gateway, could be your router IP
#address if it has been configured as the gateway
GW1=192.168.1.1
GW2=192.168.0.1

# Relative weights of routes. Keep this to a low integer value. I am using 4
# for TATA connection because it is 4 times faster
W1=1
W2=4

# Broadband providers name; use your own names here.
NAME1=bsnl
NAME2=tata

You must change the example below to use your own IP addresses and other details. Even with that inconvenience a real example is much easier to understand than examples with complex notations. The example given below is copy-pasted from our intranet configuration. It works perfectly as advertised.

Note: In this step fail-over is not addressed. It is provided later with a script which runs on startup.

First you need to create two (or more) routes in the routing table ( /etc/iproute2/rt_tables ). Open the file and make changes similar to what is show below. I added the following for my two connections:

1 bsnl
2 tata

To add a default load balancing route for our outgoing traffic using our dual internet connections (ADSL broadband connections from BSNL & Tata Indicom) here are the lines I included in rc.local file:

ip route add 192.168.1.0/24 dev eth1 src 192.168.1.10 table bsnl
ip route add default via 192.168.1.1 table bsnl
ip route add 192.168.0.0/24 dev eth2 src 192.168.0.10 table tata
ip route add default via 192.168.0.1 table tata
ip rule add from 192.168.1.10 table bsnl
ip rule add from 192.168.0.10 table tata
ip route add default scope global nexthop via 192.168.1.1 dev eth1 weight 1 nexthop via 192.168.0.1 dev eth2 weight 4

Adding them to rc.local ensures that they are execute automatically on startup. You can also run them manually from the command line.

This completes the load balancing part. Let's now see how we can achieve fail-over so the routes are automatically changed when one or more connections are down and then changed again when one or more more connections come back up again. To do this magic I used a script.

How to setup fail-over over multiple load balanced ADSL / Cable connections
Please follow the steps below and preferably in the same order:

First download the script which checks for and provides fail-over over dual ADSL / Cable internet connections and save it to /usr/sbin directory (or any other directory which is mounted available while loading the OS).
Change the file permissions to 755:
chmod 755 /usr/sbin/gwping
Open the file (as root) in an editor like vi or gedit and edit the following parameters for your environment:
#IP Address or domain name to ping. The script relies on the domain being pingable and always available
TESTIP=www.yahoo.com

#Ping timeout in seconds
TIMEOUT=2

# External interfaces
EXTIF1=eth1
EXTIF2=eth2

#IP address of external interfaces. This is not the gateway address.
IP1=192.168.1.10
IP2=192.168.0.10

#Gateway IP addresses. This is the first (hop) gateway, could be your router IP
#address if it has been configured as the gateway
GW1=192.168.1.1
GW2=192.168.0.1

# Relative weights of routes. Keep this to a low integer value. I am using 4
# for TATA connection because it is 4 times faster
W1=1
W2=4

# Broadband providers name; use your own names here.
NAME1=BSNL
NAME2=TATA

#No of repeats of success or failure before changing status of connection
SUCCESSREPEATCOUNT=4
FAILUREREPEATCOUNT=1

Note: Four consecutive success indicates that the gateway is up and one (consecutive) failure indicates that the gateway went down for my environment. You may want to modify it to better match your environment.

Add the following line to the end of /etc/rc.local file:
nohup /usr/sbin/gwping &
In the end my /etc/rc.local file has the following lines added in total:

ip route add 192.168.1.0/24 dev eth1 src 192.168.1.10 table bsnl
ip route add default via 192.168.1.1 table bsnl
ip route add 192.168.0.0/24 dev eth2 src 192.168.0.10 table tata
ip route add default via 192.168.0.1 table tata
ip rule add from 192.168.1.10 table bsnl
ip rule add from 192.168.0.10 table tata
ip route add default scope global nexthop via 192.168.1.1 dev eth1 weight 1 nexthop via 192.168.0.1 dev eth2 weight 4
nohup /usr/sbin/gwping &

An astute reader may note that the default setup with dual load balanced routing (7th line) is really not required as the script is configured to force routing based on the current status the very first time. However it is there to ensure proper routing before the script forces the routing for the first time which is about 40 seconds in my setup (can you tell why it takes 40 second for the first time?).

Concluding thoughts
In the process of finding and coding the simple solution above, I read several documents on routing including the famous lartc how-to (many of whose commands didn't work as described on my Fedora Core system) & nano.txt among several others. I think I have described the simplest possible solution for load balancing and transparent failover of two or more DSL / Cable connections from one or more providers where channel bonding is not provided upstream (requires cooperation from one or more DSL providers); which is the most common scenario. I would welcome suggestions and improvements to this document.

The solution has been well tested in multiple real and artificial load condition and works extremely well with users never realizing when a connection went down or came back up again.

Networking is a complex thing and it is conceivable that you may run into issues not covered here. Feel free to post your problems and solutions here. However, while I would like to, I will not be able to debug and solve individual problems due to time constraints.

I may however be able to offer useful suggestions to your unique problems. It may however be noted that I respond well to CafĂ© Estima Blend™ by Starbucks and move much quicker on my todo list. It is also great as a token of appreciation for my hard work. The "velvety smooth and balanced with a roasty-sweet flavor this blend of coffees is a product of the relationships formed between" us.

In a followup article I discussed how to configure single / dual / multiple ADSL / cable connections, firewall, gateway / NAT With Shorewall Firewall.

No comments:

Post a Comment