Do you have too many tiers?
Low-tier data centres have the potential to be more efficient: less redundancy means less equipment, less cost, less power. With this in mind, are they a strategic opportunity in today’s climate, asks TOM TOWNSEND, networks and data centre manager, Information and Technology Management (ITM), University of Canberra.
As a society, we still believe that ‘bigger is better’, and the data centre industry is no different. But with ever-increasing IT demands, can we afford to ‘gold plate’ data centres regardless of the criticality of the IT within? To date, data centres have been built as fortresses of high availability, normally catering to the needs of the most critical system (or client) they host. What if, by distributing your IT systems between multiple data centres, significant savings could be made?
This article will show how a pair of low-tier data centres can deliver availability within 45 minutes a year of a tier 3 data centre, at 22 percent less cost over 10 years. It will also show that a pair of tier 3 data centres can offer superior uptime to a single tier 4, for less than the cost of that tier 4.
The availability of an individual data centre is only a small piece of the puzzle when planning the resilience and availability of your IT services. Many organisations have multiple facilities at their disposal, whether owned and operated or co-location, but most keep the whole of each IT system at a single site.
Therefore, we ask: what happens to availability when an IT system is spread between multiple facilities? Does splitting your IT systems between multiple facilities negate the need for high-tier data centres?
The main goal of this article is to get you questioning preconceptions about data centre resilience and to start you thinking about how you spread IT systems between your data centres. The challenge remains how to achieve your targeted resilience and availability.
In 2012, I worked on evaluating the IT services hosted within a medium-size organisation in terms of disaster recovery and business recovery planning. As a result, we determined only 17 percent of the systems analysed were critical to the running of the business.
In addition to those ‘critical’ systems, another four percent were required to support them, e.g. DNS and firewalls. We defined a critical system as any system that, if unavailable for up to two business days, would result in either financial loss or reputational damage.
After taking into account the number of servers used by each system in the entire environment, we found only 10.7 percent of the IT environment was there to support our critical and foundation systems, as shown in the sample division of IT systems graph.
Therefore, you could argue that all of the financial investment into these data centres was for less than 11 percent of our environment. This work raised several questions for me. The most difficult question to answer was, ‘To what degree do we increase availability of our systems by splitting them between several sites?’
In my observations of most medium to large enterprises (the Microsofts and Googles of the world aside), most have multiple data centres but host the majority of their systems out of one site; even systems set up to be highly available tend to sit in a single site. This can be for a number of reasons, one of which is historical technical constraints such as shared SANs. Today, this is less of an issue, with most IT systems able to cluster effectively between sites.
The first step to understanding this problem was getting the logic right and deciding what to include and exclude in the calculations. We did this by comparing two scenarios, allowing us to see the gains and losses between each.
The first scenario is a highly available IT system with two redundant halves, both in a single data centre with redundant links to the internet. The second scenario takes the same highly available IT system, but this time its two halves are split between two data centres, each with the same redundant links to the internet.
By taking the commonly accepted availabilities of each component in these scenarios, the combined availability can be calculated. For example, in the first scenario with a single data centre, the key figures are the availability of the data centre facility and the availability of the IT system. In this example, I’ve used facility availability figures from the Uptime Institute for its four-tier data centre classification.
For the IT system availability, figures from Gartner for a ‘best-of-class’ highly available IT system were used. In each calculation, the only changing variable is the tier of data centre. Using these source availabilities from both Uptime (www.uptimeinstitute.com) and Gartner (www.gartner.com), the availability of the complete system can be calculated. See the results for the single data centre scenario in the ‘Single data centre + IT availability’ table.
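For a single site, the facility and the IT system sit in series: the service is up only when both are up, so their availabilities multiply. A minimal sketch of that calculation follows; the tier figures are the commonly cited Uptime Institute availabilities, while the 99.5 percent IT system availability is an assumed placeholder standing in for the Gartner figure, not a number from this article.

```python
# Single data centre scenario: facility and IT system in series,
# so the combined availability is the product of the two.

# Commonly cited Uptime Institute availabilities per tier (percent).
TIER_AVAILABILITY = {1: 99.671, 2: 99.741, 3: 99.982, 4: 99.995}

# Assumed 'best of class' HA IT system availability (illustrative only).
IT_AVAILABILITY = 99.5

HOURS_PER_YEAR = 24 * 365

def single_site_availability(tier: int, it_avail: float = IT_AVAILABILITY) -> float:
    """Availability (%) of facility + IT system in series."""
    return TIER_AVAILABILITY[tier] / 100 * it_avail / 100 * 100

for tier in sorted(TIER_AVAILABILITY):
    a = single_site_availability(tier)
    downtime = (1 - a / 100) * HOURS_PER_YEAR
    print(f"Tier {tier}: {a:.3f}% available, ~{downtime:.1f} h downtime/year")
```

Note how the IT system's own availability dominates: even a tier 4 facility cannot lift the combined figure above the IT system's 99.5 percent.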
Table 1 shows reasonably realistic results in terms of the availability difference between each of the four commonly discussed tier levels of data centre. The values from the first scenario serve as a baseline for the dual data centre scenario. Now, how does the picture look when we split the two halves of the same IT system between two data centres?
It is important to remember that each of these availabilities for the second scenario is calculated using the same availability values as the first scenario, with the exception of the data centre availabilities. The most interesting finding of this analysis is that a pair of tier 1 data centres (which are no better than customised office buildings) can achieve almost identical availability to a single tier 3 or 4 data centre, with their availability varying by less than 1.5 hours per year; see the ‘Dual data centre + IT system availability’ in table 2.
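The dual scenario places one redundant half in each facility, so the service fails only if both sites are down; the two site paths combine in parallel. The sketch below is my reconstruction of that logic under simplifying assumptions, not the article's exact model: each site path is treated as independent, and the 99.0 percent availability of a single half running alone is an assumed value.

```python
# Dual data centre scenario: one half of the IT system in each facility.
# The service survives if either site path is up, so the paths combine
# in parallel: A = 1 - (1 - A1) * (1 - A2).

TIER_AVAILABILITY = {1: 99.671, 2: 99.741, 3: 99.982, 4: 99.995}
HALF_IT_AVAILABILITY = 99.0  # assumed availability (%) of one half on its own

def site_path(tier: int) -> float:
    """Probability (0-1) that one site's facility and its IT half are both up."""
    return TIER_AVAILABILITY[tier] / 100 * HALF_IT_AVAILABILITY / 100

def dual_site_availability(tier_a: int, tier_b: int) -> float:
    """Availability (%) when the service survives if either site is up."""
    return (1 - (1 - site_path(tier_a)) * (1 - site_path(tier_b))) * 100

# e.g. comparing a pair of tier 1s to a pair of tier 4s:
print(f"2 x tier 1: {dual_site_availability(1, 1):.4f}%")
print(f"2 x tier 4: {dual_site_availability(4, 4):.4f}%")
```

Even with these rough inputs, the parallel combination shows the same shape as the article's tables: a pair of tier 1s lands remarkably close to a pair of tier 4s, because multiplying two small failure probabilities together shrinks the gap between tiers.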
The ‘Combined data centre + IT system availability’ table shows all the above figures sorted in order of availability. Following these calculations, I found a surprisingly small difference between a pair of tier 1 data centres and a pair of tier 4s (2:34:09). However, the major difference was observed when comparing a single tier 1 or tier 2 data centre to any other tier or combination of data centres.
This is primarily due to the relatively low availability of even a best-of-class, highly available IT system, which, including scheduled downtime, can suffer several days a year of downtime even when all is well. Note the marked improvement going from a single tier 1 to a pair of tier 1s – just over 26 hours (26:12:47). With this in mind, it could be argued that we are overinvesting in data centre availability. Perhaps more effort should be spent on IT system availability instead?
The highest availability was found when combining a pair of tier 4 data centres, which was no surprise. However, what is surprising is that a pair of tier 4 data centres made only an hour and a half improvement in availability over and above a single tier 3 – again primarily due to the IT system availability figures. Another interesting comparison is that a pair of tier 2s or a single tier 2 plus a tier 1 are both within 45 minutes of the single tier 3 in terms of their availability per year.
Following the data centre availability analysis, the next question to answer was: ‘how much would each of these combinations cost?’
In order to compare each combination, a TCO (total cost of ownership) calculator (www.thecloudcalculator.com) was used to estimate the cost over a 10-year period for a data centre of each tier rating at a 350-rack, 1-megawatt scale and then, for fair comparison, at a half-size 175-rack, 500-kilowatt scale for the dual data centre scenario.
The assumption here is that the IT footprint, regardless of whether it is in a single facility or split between two, is the same. The calculator takes into account the increase in staff and resources that can’t be split between sites, so you’ll note the cost of two small data centres is more than the single data centre with the same IT capacity.
Looking at the ‘Combined data centre + IT system availability Inc TCO’ table, we start to get a more complete picture. Interestingly, this analysis reveals that we can have three different combinations of data centres (two tier 4s, two tier 3s or a single tier 4) at the same cost as a single ‘large’ tier 3. Also, two smaller tier 3s are only $2.3 million more expensive than a single tier 3, while improving uptime by almost an hour and a half, and still remaining cheaper than a single tier 4 data centre by $800,000!
If a tier 3 data centre in its own right is either cost prohibitive or not needed for your organisation, there are some interesting options combining low-tier data centres. For example, you can save $15.2 million over 10 years by having a tier 2 plus a tier 1. Keeping in mind that these are theoretical numbers, and assuming the tier 3 is your starting point, that is a massive 22 percent cost reduction over a 10-year period.
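As a sanity check on that figure: the stated $15.2 million saving and 22 percent reduction together imply a single tier 3 baseline TCO of roughly $69 million over 10 years. That baseline is my inference, not a number quoted in the article.

```python
# Sanity check: tier 2 + tier 1 pair versus a single tier 3 baseline.
# The $69.1m baseline is an assumed value, back-calculated from the
# article's stated $15.2m saving and 22 percent reduction.

tier3_tco = 69.1e6   # assumed single tier 3 10-year TCO (USD)
saving = 15.2e6      # stated saving for a tier 2 + tier 1 pair (USD)

reduction = saving / tier3_tco * 100
print(f"Cost reduction: {reduction:.0f}%")
```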
The idea of multiple smaller-tier data centres isn’t without its challenges. A perfect example is physical security: if your organisation requires heightened security but doesn’t require the availability offered by a high-tier data centre, then traditional standards may need to be abandoned, or at least examined on their own merits instead of applied one-size-fits-all.
In my opinion, there are a number of interesting opportunities ahead for the data centre industry. How are we going to adapt to the ever-changing demands on IT, taking into account the related environmental pressures, not to mention diminishing IT budgets?
Any engineer will tell you a low-tier data centre has the potential to be more efficient: less redundancy means less equipment, less cost, less power. With this in mind, are low-tier data centres a strategic opportunity in today’s climate? By thinking differently about how we use our data centres and where we locate our IT, there are significant savings to be made.
Above all, I hope this article has you questioning your preconceptions about data centre resilience – and thinking differently about how you spread your IT systems between your data centres.