Failover and Load Balancing using HAProxy

HAProxy is open source proxy that can be used to enable high availability and load balancing for web applications. It was designed especially for high load projects so it is very fast and predictable, HAProxy is based on single-process model.

In this post I’ll describe sample setup of HAProxy: users’ requests are load balanced between two web servers Web1 and Web1, if one of them goes down then all the request are processed by alive server, once dead servers recovers load balancing enables again. See topology to the right.

Installation

HAProxy is included into repositories for major Linux distributions, so if you’re using Centos, Redhat or Fedora type the following command:

yum install haproxy

If you’re Ubuntu, Debian or Linux Mint user use this one instead:

apt-get install haproxy

Configuration

As soon as HAProxy is installed it’s time to edit its configuration file, usually it’s placed in /etc/haproxy/haproxy.cfg. Official documentation for HAProxy 1.4 (stable) is here.

Here is configuration file to implement setup shown at the diagram and described above:

global
        user daemon
        group daemon
        daemon
        log 127.0.0.1 daemon
 
listen http
        bind 1.2.3.4:80
        mode http
        option tcplog
 
        log global
        option dontlognull
 
        balance roundrobin
        clitimeout 60000
        srvtimeout 60000
        contimeout 5000
        retries 3
        server web1 web1.example.com:80 check
        server web2 web2.example.com:80 check
        cookie web1 insert nocache
        cookie web2 insert nocache

Let’s stop on most important parts of this configuration file. Section global specifies user and group which will be used to run haproxy process (daemon in our example). Line daemon tells HAProxy to run in background, log 127.0.0.1 daemon specifies syslog facility for sending logs from HAProxy.

Section listen http contains line bind 1.2.3.4:80 that specifies IP address and port that will be used to accept users’ requests (they will be load balanced between Web1 and Web2). Line mode http means that HAProxy will filter all requests different from HTTP and will do load balancing over HTTP protocol.

Line balance roundrobin specifies load balancing algorithm according to which each web server (Web1 and Web2) will be used in turns according to their weights. In our example weights for both servers are the same so load balancing is fair.

Lines server web1 … and server web2 … specify web servers for load balancing and failover, in our case they are load balanced according to round robin algorithm and have the same priority/weight.

The last two lines in configuration files are optional, they makes it possible to preserve cookies, it means for example that if you logged in to web application hosted at Web1 and then HAProxy forwarded your next request to Web2 you will still have logged in session opened as cookies with session id from Web1 will be sent to you from Web2 as well.