S3 is a good platform for saving files without having to worry about storage, connections, and bandwidth.

S3 works on the idea of buckets of content. The content of a bucket can be private, public, or managed with access controls. If you have a small amount of traffic on your website and the access patterns are sparse, the cost of S3 is low; moreover, Amazon gives about 5 GB free every month. But once you reach a scale where you transfer hundreds of gigabytes every month, the per-gigabyte cost of Amazon turns out to be very high.
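To make the cost claim concrete, here is a rough back-of-the-envelope calculation. The rate is an assumption for illustration (S3 data transfer out was on the order of $0.09 per GB at the time); the free-tier and traffic figures are examples, not billing advice.

```python
# Rough illustration of monthly S3 transfer cost at scale.
# Assumed rate: ~$0.09 per GB transferred out (illustrative, not current pricing).
rate_per_gb = 0.09
free_tier_gb = 5      # roughly what Amazon gives free each month
monthly_gb = 500      # hypothetical traffic: hundreds of GB/month

billable_gb = max(0, monthly_gb - free_tier_gb)
cost = billable_gb * rate_per_gb
print(f"{billable_gb} GB billed, about ${cost:.2f}/month")
```

At a few GB a month the bill rounds to nothing; at hundreds of GB it becomes a real line item, which is what motivated the caching setup below.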

Since I was part of a startup with very little cash to spare, I came up with an idea to save bandwidth without having to change the backend or migrate to another provider. The gist of the idea was to set up a reverse proxy server in front of the Amazon servers so that every request reaches our proxy server first. Our proxy server in turn makes calls to the Amazon server to fetch the relevant files. Since a single server cannot handle a heavy load of requests, we also set up a load balancer which distributes requests based on the source IP address.

First we start by creating our Amazon cache server. Here is the Dockerfile for it.
The Dockerfile creates the nginx server which proxies traffic from S3. To build the docker image run `docker build -t s3cache .`
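The original Dockerfile is not reproduced here; a minimal sketch, assuming the official nginx base image and a local `nginx.conf` holding the proxy configuration, could look like this:

```dockerfile
# Minimal sketch: an nginx image carrying our S3 proxy-cache config.
FROM nginx:stable

# Replace the default site config with our reverse-proxy configuration.
COPY nginx.conf /etc/nginx/conf.d/default.conf

EXPOSE 80
```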
This config file sets up the reverse proxy and cache configuration for the nginx server. Make sure you replace bucket with your bucket name. This configuration only applies to public buckets without authentication.
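The config file itself is not shown above; a minimal sketch of such a configuration might look like the following. The bucket name `bucket`, the cache size, and the validity window are all placeholders you would adjust:

```nginx
# Cache up to 10 GB of S3 objects on disk, evicting entries unused for a day.
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=s3cache:10m
                 max_size=10g inactive=24h;

server {
    listen 80;

    location / {
        # Forward requests for public objects to the S3 bucket.
        proxy_pass       https://bucket.s3.amazonaws.com;
        proxy_set_header Host bucket.s3.amazonaws.com;

        # Serve cached copies; the header makes hits/misses visible for debugging.
        proxy_cache       s3cache;
        proxy_cache_valid 200 24h;
        add_header        X-Cache-Status $upstream_cache_status;
    }
}
```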

Lastly we set up the haproxy server.

docker pull dockerfile/haproxy


You will also have to change the `server ip` to the nginx IP address. To get the IP address of a container run the following command

docker inspect -f "{{.NetworkSettings.IPAddress}}" <container id>


Once you get the IP address, replace the `server ip` with it and run the following. This will run the docker image.

docker run -d -p 80:80 -p 8080:8080 -v <dir>:/haproxy-override dockerfile/haproxy


Replace dir with the directory that contains the haproxy.cfg file.
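The haproxy.cfg for this container is not reproduced above; a minimal sketch could look like the following, assuming a single nginx backend at 172.17.0.2 (the `server ip` placeholder), source-IP balancing, and the stats credentials mentioned below:

```
global
    maxconn 2000

defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

listen s3cache
    bind 0.0.0.0:80
    # Route each client to the same backend based on its source IP.
    balance source
    server nginx1 172.17.0.2:80 check

listen stats
    bind 0.0.0.0:8080
    stats enable
    stats uri /haproxy?stats
    stats auth admin:password
```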
To check the status of your haproxy server, visit this link:
http://dockerip:8080/haproxy?stats
The username is admin and the password is password.
Try making some requests; you will notice that they always go through the same server, because requests are balanced on your source IP address.



If you are planning to scale your website to accommodate extra traffic, you have two options: you can scale vertically or horizontally.
Vertical scaling usually involves adding extra RAM/processing power, which may be desirable, but there is a limit to this kind of scaling.
Horizontal scaling is where you set up multiple machines that work together to get the work done. Haproxy is a load balancer which will let you scale your infrastructure horizontally.

Setting up load balancing between 2 machines is pretty simple. Firstly, you will require two machines with static addresses or DNS entries to identify the machines.

I will be using vagrant, a virtual machine environment without a GUI, during this process of setting up a load balancer. More information about it can be found at http://vagrantup.com/.

I have set up two virtual machines running Ubuntu 13.04. Both machines are running Apache2 with default settings. You can replicate the same by running this in the terminal.

sudo apt-get update
sudo apt-get install apache2 php5
sudo apt-get install libapache2-mod-php5
sudo service apache2 restart

Next we get the IP addresses of the two machines and store them someplace safe.

ifconfig eth0

The IP address of machine one is 10.0.2.2 and of the second machine is 10.0.2.3.

We then launch another VM with haproxy.

Run

sudo apt-get install haproxy

This will install haproxy with default settings.
Next, to start haproxy as a service, we have to enable it in the configuration file.

vi /etc/default/haproxy
ENABLED=1
sudo service haproxy restart

Next we configure haproxy to point to the two servers we set up earlier. First we move the original haproxy configuration aside and keep it as a backup.

sudo mv /etc/haproxy/haproxy.cfg /etc/haproxy/haproxy.bak

We then create our own configuration file with

sudo vi /etc/haproxy/haproxy.cfg

Add this to the cfg file.

global
    log 127.0.0.1 local0 notice
    maxconn 2000
    user haproxy
    group haproxy

listen appname 0.0.0.0:80
    mode http
    stats enable
    cookie SRVNAME insert
    balance roundrobin
    option httpclose
    option forwardfor
    server server1 10.0.2.2:80 check
    server server2 10.0.2.3:80 check

Once this is done, a quick restart of haproxy and it will act as the frontend for the incoming traffic.