A critical aspect of maintaining a successful eCommerce presence is the ability to scale to meet demand. A fast website under high load is key to delivering a positive user experience.

The need to scale your website is usually discovered in one of two ways:

The best way:

a) You’ve had strong and consistent growth and have identified the need to add capacity to your site for visitors in a measured way.

And the way that almost everyone finds out:

b) You’ve run a big promotion, appeared on television, or had a product linked to by a celebrity, and your website is being crushed by visitors.

There are some tools and services that you can use to help your site deal with peaks, but they all work best when layered on top of a well-optimised installation. You should be getting an A or B from tools like Google’s PageSpeed Insights, YSlow or GTmetrix.

For this you will need minified CSS and JavaScript, optimised images and cache control headers set correctly. This is something you’ll talk to your developers about, but a starting point is the GTspeed plugin, created by GTmetrix (other plugins are available!).

You will also need to look at the resources that your website uses and configure your hosting accordingly. Out of the box, Linux installations are very optimistic about what resources can be used. It’s the default in Ubuntu Linux to configure the Apache web server to handle up to 50 clients. This usually works fine, but when there’s a spike in traffic, you may end up trying to use far more resources than you have.

For example, a Magento site with a couple of plugins can use between 200MB and 400MB of RAM per worker. At the default of 50 clients, it will try to use 10 to 20GB of RAM. If you don’t have that RAM on your server, everything will slow down as it starts to use the disk (swap) to make up for the shortfall.
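The arithmetic behind those figures can be sketched in a few lines. This is only an illustration: the function name `worst_case_ram_gb` is made up for this example, and it uses decimal units (1000MB to the GB) to match the round numbers above.

```python
def worst_case_ram_gb(max_clients: int, per_worker_mb: int) -> float:
    """RAM needed if every worker slot is busy at once, in GB."""
    return max_clients * per_worker_mb / 1000  # decimal units: 1000 MB = 1 GB


# Figures from the text: 50 clients, 200 MB to 400 MB per worker.
low = worst_case_ram_gb(50, 200)
high = worst_case_ram_gb(50, 400)
print(f"Worst case at 50 clients: {low:g} to {high:g} GB")
```

Comparing that range with the RAM actually installed shows how far the default over-commits the server.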

With 50 clients, an 8-core server is actually only running 8 clients’ worth of CPU, too – it’s rapidly switching among those 50 and the net result is that everyone waits longer for their response.

Instead of serving 50 clients very poorly, it’s far better for your site to serve a number that doesn’t massively over-commit your web servers – like 8 clients on a 4-CPU, 4GB RAM virtual machine. The rest of the connections will be just like people queuing to be served.
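One way to arrive at a limit like that is to take the smaller of what the memory allows and what the CPUs can reasonably run. The sketch below is a rule of thumb, not a formula from any particular web server: `reserved_mb` (RAM kept back for the operating system) and `clients_per_cpu` are assumptions, chosen here so that the result matches the example above.

```python
def safe_max_clients(ram_mb: int, reserved_mb: int, per_worker_mb: int,
                     cpus: int, clients_per_cpu: int = 2) -> int:
    """Pick a client limit that fits in both RAM and CPU."""
    # How many workers fit in memory once the OS has its share.
    by_memory = (ram_mb - reserved_mb) // per_worker_mb
    # How many workers the CPUs can run without heavy switching (assumed ratio).
    by_cpu = cpus * clients_per_cpu
    # Always allow at least one worker so the site can respond at all.
    return max(1, min(by_memory, by_cpu))


# 4 CPUs, 4 GB RAM, 400 MB per worker, 800 MB held back for the OS (assumed).
print(safe_max_clients(ram_mb=4000, reserved_mb=800, per_worker_mb=400, cpus=4))
```

Measure the real per-worker memory of your own site before settling on a number; the 400MB used here is just the upper figure quoted earlier.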

When you’re not over-committing, the people that you let through will get served quicker, which means that slots will open faster and you’ll serve more people (and those people will have a better experience) than if you tried to do everything at once. (Just imagine checking in at an airport if everyone just walked up to the airline desks at the same time.)
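A deliberately simplified model shows why queuing helps. It assumes 50 identical requests arrive at the same moment, each needing one second of CPU, on an 8-core server. It ignores swapping and context-switch overhead, both of which would make the over-committed case look worse still. The function names are made up for this example.

```python
def mean_response_shared(requests: int, cores: int, cpu_seconds: float) -> float:
    """Every request runs at once and the cores are shared equally."""
    if requests <= cores:
        return cpu_seconds
    # Each request gets cores/requests of a core, so all finish together, late.
    return cpu_seconds * requests / cores


def mean_response_queued(requests: int, slots: int, cpu_seconds: float) -> float:
    """Only `slots` requests run at a time (one core each); the rest queue."""
    total = 0.0
    for i in range(requests):
        wave = i // slots + 1  # the batch in which this request is served
        total += wave * cpu_seconds
    return total / requests


print(mean_response_shared(50, 8, 1.0))   # everyone let in at once
print(mean_response_queued(50, 8, 1.0))   # 8 at a time, the rest queue
```

In this model the average visitor waits 6.25 seconds when all 50 requests are let in together, and 3.64 seconds when they are served 8 at a time.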

Now that you’re limiting the number of dynamic pages being processed at any one time, you should make sure that everything else is being served through another route and not using the precious resources that you’re saving for customers who are interacting with the site: adding to basket and checking out.

Once that groundwork is laid, you can use tools like Varnish Cache as an HTTP accelerator and look at employing a content delivery network (CDN) and a static web server to serve your static assets.

Varnish will efficiently serve all of those static assets: product images, CSS, web fonts, JavaScript and the “not logged in” version of your site. Once you’ve told it (through cache control headers) what it can cache and for how long, it will maintain its own copy of those resources. It won’t ask your overworked web server for another copy until either that resource expires or it’s kicked out of the cache by another, more popular, resource.
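The behaviour described above (keep a copy until it expires or is pushed out) can be illustrated with a toy cache. This is not how Varnish is implemented; `TinyCache`, `origin` and the 60-second lifetime are made up for the example, and least-recently-used eviction stands in for "kicked out by a more popular resource".

```python
from collections import OrderedDict


class TinyCache:
    """Toy cache: entries expire after a TTL and the least recently used is evicted."""

    def __init__(self, capacity, fetch):
        self.capacity = capacity
        self.fetch = fetch          # callable(url) -> (body, ttl_seconds)
        self.store = OrderedDict()  # url -> (body, expires_at)
        self.backend_hits = 0       # how often the web server was asked

    def get(self, url, now):
        entry = self.store.get(url)
        if entry is not None and entry[1] > now:
            self.store.move_to_end(url)  # mark as recently used
            return entry[0]
        # Missing or expired: ask the web server and keep a fresh copy.
        body, ttl = self.fetch(url)
        self.backend_hits += 1
        self.store[url] = (body, now + ttl)
        self.store.move_to_end(url)
        while len(self.store) > self.capacity:
            self.store.popitem(last=False)  # evict the least recently used
        return body


def origin(url):
    """Stand-in for the web server; says each asset may be cached for 60 seconds."""
    return f"<contents of {url}>", 60


cache = TinyCache(capacity=2, fetch=origin)
cache.get("/logo.png", now=0)    # first request: fetched from the web server
cache.get("/logo.png", now=30)   # still fresh: served from the cache
cache.get("/logo.png", now=61)   # expired: fetched again
cache.get("/site.css", now=62)   # new asset: fetched
cache.get("/app.js", now=63)     # new asset: fetched, /logo.png is evicted
print(cache.backend_hits)
```

Of the five requests, only four reach the web server. On a real site, where the same assets are requested thousands of times within their lifetime, the saving is far larger.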

A CDN and a static web server will perform a similar role: moving anything that doesn’t need to be done by the application’s web server far away from it. Many CDNs work like a proxy and will also cache your assets: Fastly (a popular CDN) is even based on a customised version of Varnish.

These are a great option when you don’t want to (or don’t have time to) set up a dedicated proxy and they have other benefits for site speed, like moving cached copies of your images and assets geographically closer to your customers. Some CDN providers will even offer denial-of-service protection in case it’s not just a lot of customers bringing your website down, but a malicious attack.

What about when all of that isn’t enough? When you’ve optimised what you can and moved as much away from the application web servers as possible, your options come down to getting a bigger server or getting more servers.

At iWeb, our standard dedicated hardware has 24 cores and 96GB of RAM. This could be dedicated to one customer, or split into virtual machines and containers for several customers. Other hosting companies will have other options available, but it’s safe to say that the capacities of dedicated hardware are much higher than virtual machines, containers or shared hosting.

If you’re persistently hitting load issues, dedicated hardware is still the gold standard for performance and capacity, but it’s not without downsides, not least of which is the cost. It’s also quite inefficient unless you really are busy most of the time.

For people experiencing occasional spikes in demand, “scaling out” to several virtual servers is the right choice. Your first extra server should take over everything that doesn’t have to run on the web server itself: this usually means moving databases, Redis and Solr, which can be done with minimal changes to the application.

After that, it’s a case of adding more web servers for more capacity. Popular applications like WordPress and Magento expect all of their web servers to share the same document root, so these web servers will need a private network and shared folders set up, too.

This approach is often deployed in combination with cloud computing, but even with cloud providers like Amazon, a large investment in automation and technical knowledge is required to get the best results.

By this point, your application’s capacity can grow or shrink in units of “one web server”, and whatever capacity that supplies can be easily added and removed. You can add capacity to coincide with marketing events or in emergencies to deal with unexpected spikes. And when things are quiet, you can scale down your application to maintain a cost-efficient web presence.
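As a rough planning aid, the number of web servers to run for an event can be estimated from the peak number of simultaneous dynamic requests you expect. The sketch below is hypothetical: the function name and the traffic figures are invented, and the 8 clients per server is the limit used in the earlier example.

```python
import math


def web_servers_needed(peak_concurrent_requests: int, clients_per_server: int,
                       minimum: int = 1) -> int:
    """Round up to whole servers, never dropping below a baseline."""
    needed = math.ceil(peak_concurrent_requests / clients_per_server)
    return max(minimum, needed)


print(web_servers_needed(30, 8))  # a promotion expected to bring 30 at once
print(web_servers_needed(5, 8))   # a quiet week
```

With these made-up figures, the promotion calls for four web servers and the quiet week for one.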