Internet hosting: the cost of reliability

[This is the second in a series of three articles to help people understand how internet hosting services work from a business perspective. They’re written for my small business clients over at Prussia.Net as part of a review of our own internet hosting service, but I’m hoping they’ll be of general interest. Enjoy.]


As I explained yesterday, the big cost in providing internet hosting is paying humans to provide support. However, there are still some technical factors that affect the price, and that’s what we’re looking at today.

Most internet hosting customers would be familiar with the usual measures: the amount of storage space you get and the amount of data transfer (“bandwidth”) per month. Those raw measures of capacity are certainly important. You need enough capacity to meet your needs. But you also need to consider performance, reliability, scalability and flexibility.

Performance. The key issues here are whether you’re sharing a server or have your own, the performance of that server, and the performance of the network it’s connected to.

At the lower end of the market, shared web hosting means your website and mailboxes are sharing a computer with other customers — sometimes dozens, hundreds or even thousands. If you want a server computer just for your business, that’s called dedicated hosting. Obviously it costs more, but it does mean you have the computer’s total capacity. Other customers’ usage won’t affect your website. It also reduces the security risk.

There are other systems such as clustered hosting, where the load of many sites is spread across multiple computers, and virtual private servers, where it looks like you have a dedicated server but it’s being simulated — but that’s all outside the scope of this article.

Whether it’s shared or dedicated hosting, the server’s performance can make a difference. The apparent speed of your website will in part depend on the speed of the server’s hard drives and processors, the amount of memory (RAM) it has and so on, as well as the capacity of its network link. Cheap hosting providers may put many, many customers onto a relatively low-grade computer with poor network links. And a cheap data centre may provide less network capacity for a given number of customer websites.

Prussia.Net has been offering shared hosting with around 50 accounts and a total of 170 domains running on a relatively modest server with a Pentium 4 2.66GHz processor and 2GB of RAM. While this sounds small compared with a desktop computer, remember that servers don’t have to run a graphical interface. That said, this server is reaching capacity and that’s one of the factors that led us to review what we do. The server is in ServePath’s data centre in San Francisco, which is provided with high-capacity data links to the internet.

Reliability. A system’s reliability is measured by its uptime, the percentage of time for which the system has been “up” and running. Sometimes it’s measured in the “number of nines”, for example “four nines” being 99.99% reliable.

Many hosting providers advertise 99% reliability, which sounds good until you realise that you could endure more than 7 hours of downtime per month and still be getting the service you’re paying for. That’s not good if those 7 hours take out a busy working day.

“Four nines” or 99.99% reliability is 4 minutes 23 seconds of downtime per month, and “five nines” or 99.999% reliability is a mere 26 seconds of downtime per month, or a little over 5 minutes in total per year.
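If you’d like to check those figures yourself, here’s a rough Python sketch of the arithmetic. It assumes an “average” month of 365.25 days divided by 12; the function names are just illustrative, and a provider may well define its measurement period differently.

```python
# Average month length in seconds: 365.25 days / 12 (an assumption;
# a provider's agreement may define the period differently).
SECONDS_PER_MONTH = 365.25 * 24 * 60 * 60 / 12


def downtime_allowed(uptime_percent, period_seconds=SECONDS_PER_MONTH):
    """Return the seconds of downtime permitted by an uptime percentage."""
    return period_seconds * (1 - uptime_percent / 100)


def describe(seconds):
    """Format a duration in seconds as hours, minutes and seconds."""
    total = round(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}h {minutes}m {secs}s"


for level in (99, 99.5, 99.9, 99.99, 99.999):
    allowed = describe(downtime_allowed(level))
    print(f"{level}% uptime allows {allowed} of downtime per month")
```

Passing `12 * SECONDS_PER_MONTH` as the second argument to `downtime_allowed` gives the yearly figure instead.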

Setting up such highly-reliable systems obviously takes engineering skills, planning and money.

Many hosting providers advertise 99% or 99.5% or 99.9% reliability, excluding “scheduled downtime” for systems maintenance. If you want higher reliability then you can expect to pay much, much more money. If a problem has to be fixed within five minutes, you can’t rely on someone responding to a complaint and then trying to work out how to fix things. Backup systems have to be set up in advance, with automated monitoring ready to switch everything over in the event of a failure.
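To make “automated monitoring” a little more concrete, here’s a minimal Python sketch of the idea: check the site regularly and trigger a switch-over after several failures in a row. Everything here is hypothetical, including the names `is_healthy`, `monitor` and `fail_over`, the thresholds and the example URL. A real set-up would usually rely on purpose-built monitoring and failover tools, and the switch-over itself (changing DNS, redirecting a load balancer and so on) is not shown.

```python
import time
import urllib.request


def is_healthy(url, timeout=5):
    """Return True if the URL answers with an HTTP 2xx status in time."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # Covers connection errors, timeouts and HTTP error responses.
        return False


def monitor(check, fail_over, max_failures=3, interval=10, sleep=time.sleep):
    """Poll check() until it fails max_failures times in a row,
    then call fail_over() once and return."""
    failures = 0
    while failures < max_failures:
        # A single success resets the count, so brief blips are ignored.
        failures = 0 if check() else failures + 1
        if failures < max_failures:
            sleep(interval)
    fail_over()


# Example use (hypothetical URL and switch-over action):
# monitor(lambda: is_healthy("https://www.example.com/"),
#         lambda: print("Primary is down, switching to backup"))
```

With these illustrative settings a failure is acted on after three bad checks, roughly 20 seconds plus any time spent waiting for timeouts, which is the sort of response time a human on call can’t match.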

Many hosting customers forget that even if their hosting server is, say, 99.9% reliable, the overall reliability of their website or email will depend on how their website has been built and what arrangements they’ve made for their web developer to be available to fix problems. The hosting server could still be 99.9% reliably serving out a broken website!

It’s also easy to forget that even if a hosting server is “guaranteed” to be 99.9% reliable, that may just mean you get your $29 monthly fee refunded if things go wrong. Again, not good if being offline for an hour means you’re losing hundreds of dollars.

Other hosting providers don’t specify an exact target reliability level, but simply take reasonable steps to keep things going. This is called “best effort” reliability. While “best effort” hosting is often quite reliable, there are no guarantees.

Prussia.Net’s data centre, ServePath, promises a 10,000% Guaranteed®, 100% Uptime Service Level Agreement, which means that for every minute their network is unavailable they refund us 100 minutes’ worth of our monthly fees. However, Prussia.Net itself offers only “best effort” reliability, as we don’t have automated monitoring systems. In practice, we’ve experienced 133 minutes of unscheduled downtime in the last six months, which is about 99.95% reliability, but that’s more through good luck than planning.
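Here’s a rough Python sketch of both calculations: the uptime figure from the 133 minutes of downtime, and a refund of 100 minutes’ worth of fees per minute of outage. The six-month period is assumed to be half of a 365-day year, the month is assumed to be 30 days, and the monthly fee in the example is made up. The actual terms of ServePath’s agreement (caps, exclusions, how outages are measured) aren’t modelled here.

```python
MINUTES_PER_DAY = 24 * 60
# Six months taken as half of a 365-day year (an assumption).
SIX_MONTHS_MINUTES = 365 / 2 * MINUTES_PER_DAY


def uptime_percent(downtime_minutes, period_minutes):
    """Return uptime as a percentage of the period."""
    return 100 * (1 - downtime_minutes / period_minutes)


def sla_credit(downtime_minutes, monthly_fee, multiplier=100,
               minutes_in_month=30 * MINUTES_PER_DAY):
    """Return the refund when each minute of downtime earns
    `multiplier` minutes' worth of the monthly fee (simplified)."""
    fee_per_minute = monthly_fee / minutes_in_month
    return downtime_minutes * multiplier * fee_per_minute


# 133 minutes of downtime over six months.
print(f"{uptime_percent(133, SIX_MONTHS_MINUTES):.2f}% uptime")

# Hypothetical example: a 10-minute outage on a made-up $432 monthly fee.
print(f"${sla_credit(10, 432.00):.2f} credit")
```

Even with the 100x multiplier, the credit for a short outage is small next to the monthly fee, which is why a refund-style guarantee doesn’t make up for lost business.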

Scalability. To give an overly simple explanation, this is about how your internet hosting can cope with sudden increases in demand, for example if your website suddenly becomes vastly more popular than you expect, or there’s a sudden increase in email traffic. A cheap hosting provider might be running everything very close to full capacity, which means a sudden surge in traffic will cause everything to fall over.

There’s also a business angle to this. The hosting provider might offer a certain amount of base capacity, but anything over your pre-booked capacity might still be delivered — but at a vastly higher price than if you’d organised it in advance.

Hosting can also be provided “on demand” or, to use the current buzzword, through cloud computing. This is where the data centre automatically allocates more capacity as it’s needed, and just bills for usage. This is rapidly becoming the preferred method.

Prussia.Net’s hosting server is moderately loaded. We’ve coped with surges of email 10x their normal levels without problem. However this week we saw a massive spam surge at 32x normal levels and we struggled — although this was the biggest spam surge we’ve ever seen in more than a decade of operation. I’ll write more about that soon. I’m seriously considering just on-selling cloud services instead.

Flexibility. A computer can be configured any way you want. However to make it easier to sell its services a hosting provider will usually offer only a certain set of pre-defined options. This keeps the cost down, as staff just choose from a list. Some providers will be more willing to customise the set-up, but that will always be more expensive.

Prussia.Net has always been willing to customise a client’s hosting account however they want. Indeed, this was originally one of the key differentiators of our service. However this has meant keeping prices high.

You may well be looking at this and saying, “I’m a business manager. I don’t care about these technical details. I just want things to work.” What you’re looking for, then, is a “managed service”.

A hosting provider is really just renting out capacity on a computer or multiple computers in a data centre. Questions about what options are right for your business aren’t their concern. That’s the job of your CIO or your IT Manager. “But,” you say, “I’m a small business and I want someone else to figure this out.”

That’ll be the topic of the next article in this series, “IT support vs management vs consulting”.

Comments please. This is very much a first draft of my thoughts on this topic. If you have any questions or comments, please let me know.

3 Replies to “Internet hosting: the cost of reliability”

  1. A couple of shared hosting gotchas to watch out for:

    1) speed. Make sure your inexpensive shared webhosting doesn’t come with the downside of slow webserver speeds, or reduced performance when other users of the shared host get busy or do complicated maintenance – Google now use web page load speeds as part of their weighting algorithm ( http://www.mattcutts.com/blog/site-speed/ )

    2) security. Security on large shared webhosts is important when any of those shared hosting customers could be considered malicious. ( http://www.theregister.co.uk/2010/04/12/network_solutions_wordpress_hack/ )

    big

  2. @Big: Yep, I have to agree with both of those points about shared hosting. Shared hosting is like living in a share house, and your ability to run your business online will depend in part on the behaviour of others — especially if the house is over-crowded.

    On speed, it’s going to be difficult for the customer to know just how much spare capacity the shared server has, since suppliers rarely talk about that. This is one case where a big supplier is often at an advantage, especially with a cloud computing environment, because they can just turn up the capacity.

    On security, it’s not just worrying about whether your shared-server housemates are malicious, but whether they’re even just slack or incompetent.

    The server’s first line of defence is the passwords on everyone’s user accounts. If someone has chosen a password poorly, or given it out to a contractor who’s leaked it or otherwise allowed someone onto the server who shouldn’t be there, then an attacker can install a rootkit and take over the computer.

    A very common attack is through software which hasn’t been maintained. If another customer installs, say, the WordPress content management system and then doesn’t keep it up to date, an attacker can install software to, say, send spam. This has happened twice with our server over the years. Our data centre monitors us closely, and takes the server offline if it starts sending spam — and that means all 50 customers suffer an outage of 2 or 3 hours while we disinfect the machine simply because one person was too lazy to do a 10-minute upgrade.
