Scaling, in terms of the internet, is a product or service’s ability to expand rapidly to meet demand. There are two types of scalability: vertical scalability is the traditional and easiest way to expand – upgrading the hardware you already own – while horizontal scalability means creating a network of hardware that can expand (and contract) to suit demand at a given time.
Let’s all face it: the internet is only going to get bigger. Recent events such as the uprisings in Egypt and Libya have proved beyond doubt that the developing world is beginning to utilise the internet in the same way the west has. Companies are using their servers to calculate ever more complex statistics from ever larger stores of raw data. With this onslaught of new users and more complex computation, more and more developers will have to face up to the reality that they will eventually have to scale in some form or another for their service or product to remain reliable in a rapidly growing world.
So what does it mean to scale? Scalability is an application’s ability to handle a sudden influx of users, or a sudden need for more processing power. Many websites have succumbed to the “digg” or “slashdot” effect, whereby a server’s traffic or resource usage suddenly spikes to the point where it cannot handle the load. The flooded server becomes highly unreliable as it struggles to manage the volume of data it has to process. Needless to say, if the website you are running is commercial, this can mean a lot of unhappy customers, or a distinct lack of new customers despite being featured on a large website. Staying “digg proof” can be a matter of your business succeeding or failing.
Why bother scaling horizontally? It is true that scaling vertically is the easy solution – simply boost your RAM, add a few cores, and instant gratification is yours. The problem with this mindset, however, is that it simply isn’t future-proof. One day, your site might get really popular, and you’ll wish you had planned ahead.
A great example of this (albeit handled very well by the author) appears in a blog post by Maciej Cegłowski of the bookmarking startup Pinboard. His server setup was minimal at best, and highly inefficient: one massive server handling all web traffic, one similarly massive server handling all the search engine and database calls, and one relatively modest server handling email, statistics, feeds, and a slave database (amongst other things). The database was SQL, which is notoriously difficult to scale, and there was only one email server. When Pinboard began receiving an influx of traffic, the infrastructure was quickly brought to its knees, with Maciej citing problems such as heavy write load on the database, blocking and expensive queries, and multiple single points of failure (only one email server and only one web server, for example). All of this could have been mitigated by thinking horizontally: redundant points of failure, and easy removal and addition of processing power. Maciej managed to handle the traffic in the end, partly because he resorted to Amazon EC2 cloud servers, at great cost, to process the massive backlog that had accumulated. If he had planned for horizontal scalability, he could simply have deployed some extra servers to deal with the load.
Other organisations that have famously struggled with scaling include Facebook – a company that grew so fast, its only option was to throw money at the problem with vertical scaling while, in parallel, slowly integrating horizontal scaling principles into its codebase. Facebook’s notable rollout of Hadoop has allowed it to spend as little as $2–4k per server, instead of the $10k per server it was paying for machines running real-time instances of MySQL.
The other argument often heard is, “I use a virtual solution such as a VPS, I’m able to scale up or down when I need more power, just like in the cloud, but I don’t have to deal with the headache of cloud hosting”. For many, this may indeed be a viable option, but the reality is that while you are able to expand and contract the resources you have at your disposal depending on your load, you are still “trapped” within one single physical machine. What if your site becomes insanely popular overnight, and you go from being comfortable in a small VPS to crashing the entire machine you are hosted within? There’s a good chance your host will simply suspend your account until the load (and potential revenue) subsides. That’s never fun.
The great thing about horizontal scaling is that you will increase the points of failure. “Wait, I don’t want more points of failure!” I hear many cry. The point, however, is that unless you have done something terribly wrong, not all of these “points” will fail at the same time. This means your application should be able to route around any problems you may have with your servers – or perhaps even with your network or data centre, if you have enough servers.
The other great boon of horizontal scaling is that, at scale, it is much cheaper than vertical scaling. Lots of modest servers cost far less than a few top-of-the-range ones. The more load you have, the cheaper (and more headache-free) horizontal scaling is in comparison to vertical scaling.
So what can you do to make your website scale horizontally? Put simply, instead of “beefing up”, you begin “beefing out”. Luckily, the code you have already written will often not need to be completely butchered in order to scale horizontally.
The basic premise for any web application is to create a “load balancer” – a layer that forwards each user to a server. This layer distributes the load across the servers so that no single server receives an unfair share of the traffic compared with the rest. If you are using Apache, mod_proxy_balancer is what you probably want to look into: it is an extension to the Apache proxy module that keeps track of which traffic is going where, and calculates where it should go next. If you are using a more lightweight server like nginx for load balancing, then you are in luck: nginx supports basic load balancing out of the box – all you need to do is configure it to route traffic to the appropriate servers. The servers the load balancer forwards to should be mirror images of each other.
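As a rough sketch, an nginx load-balancing setup boils down to an `upstream` block listing the mirrored servers and a `proxy_pass` pointing at it. The addresses and domain below are placeholders – substitute your own:

```nginx
# Hypothetical pool of identical back-end servers.
upstream app_servers {
    server 10.0.0.1:80;
    server 10.0.0.2:80;
    server 10.0.0.3:80;
}

server {
    listen 80;
    server_name example.com;

    location / {
        # Each request is forwarded to one server in the pool
        # (round-robin by default).
        proxy_pass http://app_servers;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```

Adding capacity is then just a matter of adding another `server` line to the pool and reloading nginx.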
Once you’ve made the decision to go horizontal, you need to prepare your set-up and make sure it performs properly. Set up a bunch of machines – more than you’ll need for day-to-day operation – and stress-test them (using a service such as blitz.io to fire traffic at your domain) to ensure the load balancing is functioning correctly. Start with a small amount of traffic and ramp it up until a server becomes unstable, recording the upper limit of its stability. Once you’re satisfied everything is set up correctly, start cloning machines onto new nodes and adding them to the pool, increasing the traffic to see how the servers handle the load. Once you’re satisfied the nodes are functioning correctly, you can shut down the machines you don’t need, safe in the knowledge that if you get an influx of traffic, you can simply add these nodes to the network – and more – as you need them. It’s even possible to automate this by monitoring traffic as it streams into the network: if traffic is close to the threshold at which a new server is required, add one into the network; likewise, if traffic has fallen, remove some servers so that the load is still handled well but your network isn’t overkill.
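The automated grow-and-shrink logic above can be sketched in a few lines. This is a toy illustration, not production code: the thresholds are assumptions, and the append/pop calls stand in for whatever your hosting provider’s API uses to boot or retire a machine. The per-server capacity figure is the upper limit you recorded during stress testing.

```python
import math

def desired_pool_size(requests_per_sec, capacity_per_server,
                      headroom=0.75, minimum=2):
    """How many servers are needed so each stays below `headroom`
    (75% by default) of its measured capacity."""
    needed = math.ceil(requests_per_sec / (capacity_per_server * headroom))
    return max(minimum, needed)

def rebalance(pool, requests_per_sec, capacity_per_server):
    """Grow or shrink the pool to match current traffic."""
    target = desired_pool_size(requests_per_sec, capacity_per_server)
    while len(pool) < target:
        pool.append(f"node-{len(pool)}")   # stand-in for "boot a clone"
    while len(pool) > target:
        pool.pop()                          # stand-in for "shut one down"
    return pool
```

Run periodically against a rolling traffic average, this keeps the network sized to the load without anyone being paged at 3 a.m.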
Straightforward? Not too bad, but we haven’t tackled the trickiest problem of all: where do we hold the data? We could have all the servers point to a single database, but then what happens if that database server goes down? The website itself will still be available (which is nice), but because it can’t connect to the database, it won’t function correctly (which is embarrassing and leads to angry clients/bosses/colleagues). One option is to follow the same basic premise as for the web servers: use a distributed database management system (DBMS) such as Redis, MongoDB, or CouchDB, which all have their own merits – or, if you are serious about increasing performance under load, perhaps Cassandra (which was specifically designed to have no single point of failure, yippee!) or Hadoop. The other solution is to use a separately hosted service for your database. There are plenty of hosts available which will manage your load balancing for you and ensure your database is always available. The drawback of these hosts, however, is that in order to get good speed, you are locked into using the same hosting provider, so that you stay within the same network. Otherwise, you may suffer a real loss in speed due to the physical distance between the database and the web servers.
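Whichever database you choose, the application side of “no single point of failure” is simple in principle: know about more than one database node and route around a dead one. Here is a minimal sketch of that idea – the hostnames are hypothetical, and `connect` stands in for whatever your real database driver provides:

```python
# Toy failover: try each replica in order, use the first that responds.
DB_REPLICAS = ["db1.internal", "db2.internal", "db3.internal"]

def connect_with_failover(replicas, connect):
    """Return a connection from the first reachable replica.

    `connect` is any callable that takes a hostname and either returns
    a connection object or raises ConnectionError.
    """
    last_error = None
    for host in replicas:
        try:
            return connect(host)
        except ConnectionError as exc:
            last_error = exc   # remember the failure, try the next node
    raise RuntimeError("all database replicas are down") from last_error
```

A real distributed DBMS does this (and much more – replication, consistency, re-joining nodes) for you, which is exactly why the systems above are worth reaching for rather than rolling your own.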
While horizontal scaling might be a little inconvenient for some, it almost always pays off in the long run. If you are vying for lots of traffic (hey, who isn’t?), or doing some serious number-crunching, it pays to split the work into lean chunks for lean servers. It’s cheaper, more reliable, and more future-proof. Give it a go.
What are your opinions on Scalability? Do crashing servers still keep you up at night? Or is it not really something for you to worry about in the short term?
John Hamelink is a Web developer born and bred in South Lanarkshire, Scotland, and now working for award-winning agency Cyber-Duck in North London. Like many of the best developers, John is self-taught and has built up an impressive portfolio of work on both personal and freelance projects. He loves to think outside the box and make innovative solutions to problems. He also likes to teach people new ideas and put his own spin on things. When he’s not in front of his computer, he’s playing his classical guitar or out camping with friends. His website is johnhamelink.com and on Twitter he is known as @John_Hamelink.