By Brett Ridley, Head of Central Operations and Facility Management - As published on the Uptime Institute Blog
Data centres provide much of the connectivity and raw computing power that drives the connected economy. Government departments, financial institutions and large corporates usually run at least some of their own IT infrastructure in-house but it's becoming more common for them to outsource their mission critical infrastructure to a certified high-availability colocation data centre.
It wasn't always like this. In the past, if a large organisation wanted to ensure maximum uptime, they would hire specialist engineers to design a server room in their corporate headquarters and invest capital to strengthen floors, secure doors and ensure sufficient supplies of connectivity and electricity.
So what changed?
For a start, the reliance on technology and connectivity has never been greater. More and more applications are mission critical and organisations are less tolerant of outages even for their secondary systems. In addition, advances in processor technology have resulted in much faster, smaller and denser servers. As servers got smaller and demand for computing and storage increased, organisations would pack more and more computing power into their server rooms. Server rooms also started growing, taking over whole floors or even entire buildings.
As the density of computing power inside the data centre increased, the power and cooling requirements became more specialised. A room with a hundred regular servers may warm up noticeably, but a portable A/C unit would probably be enough to keep staff comfortable, because the servers themselves wouldn't need any additional cooling.
However, if that same room held a hundred of the latest multi-core blade server cabinets, its power requirements would increase dramatically. To deal with the sheer amount of heat generated by the servers, the room would also need to be fitted with specialist cooling and ventilation systems to avoid a complete hardware meltdown.
At this point, relatively few organisations find it desirable or cost effective to run their own data centre facilities.
Ensuring the infrastructure of a dense computing data centre is designed and maintained to a level where it is completely reliable is an ongoing, time consuming, tedious and extremely expensive process. It requires specialist, dedicated staff backed up by a committed management with deep pockets.
In the data centre world this is known as 'operational sustainability', and it's the primary goal of all large data centre managers.
A list of requirements describing the best practices to ensure the operational sustainability of data centres has been developed by Uptime Institute, which was established with a mission to offer the most authoritative, evidence-based and unbiased guidance to help global companies improve the performance, efficiency and reliability of their business critical infrastructure.
More than 1,500 data centres have been certified by Uptime Institute, which meticulously examines every component of the data centre, including its design, construction, management and procedures. Once an assessment is successfully completed, the data centre is certified with the appropriate Tier rating.
Tier I - Basic site infrastructure
Tier II - Redundant capacity components site infrastructure
Tier III - Concurrently maintainable site infrastructure
Tier IV - Fault tolerant site infrastructure
Within these Tier ratings, data centres can also be awarded a Bronze, Silver or Gold certificate for their operational sustainability practices.
We're extremely passionate about the data centres we build and operate and we are totally obsessed with ensuring they are staffed and maintained in an environment that minimises human error.
NEXTDC has become the first data centre operator in the Southern Hemisphere to achieve Tier IV Gold Certification of Operational Sustainability from Uptime Institute. NEXTDC's B2 data centre in Brisbane received the certification, which highlights the company's excellence in managing long-term operational risks and behaviours and showcases its commitment to customers to be robustly reliable and highly efficient and to ensure 100% uptime.
The Gold Operational Sustainability standard recognises the human factors in running a data centre to meet fault tolerant standards. It includes climate-change preparedness and the growing need for edge computing, outage risk mitigation, energy efficiency, increasing rack density, and staffing trends. Achieving Gold certification requires a score of greater than 90% in all areas; Silver requires 80%-89% and Bronze 70%-79%.
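To make the scoring bands concrete, here is a minimal Python sketch that maps a set of per-area scores to a certificate level. It is an illustration only, not Uptime Institute's actual scoring method. The function name, the area names and the choice to grade on the lowest area score are assumptions, as is treating a score of exactly 90% as Gold, since the bands quoted above leave that boundary open.

```python
from typing import Mapping, Optional


def certificate_level(area_scores: Mapping[str, float]) -> Optional[str]:
    """Return "Gold", "Silver", "Bronze" or None for a set of area scores.

    Assumption: the level is decided by the weakest area, because the
    Gold band must be met "in all areas". Scores are percentages (0-100).
    """
    if not area_scores:
        raise ValueError("at least one area score is required")

    lowest = min(area_scores.values())

    # Assumption: the lower edge of each band is inclusive.
    if lowest >= 90:
        return "Gold"
    if lowest >= 80:
        return "Silver"
    if lowest >= 70:
        return "Bronze"
    return None  # below the Bronze band, no certificate


# Hypothetical area names and scores, for illustration only.
example = {"staffing": 96.0, "maintenance": 93.5, "training": 84.0}
print(certificate_level(example))
```

In this example a single area at 84% holds the whole facility to Silver, even though the other two areas sit in the Gold band.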
The physical design and construction of a data centre can be solid but that's only two-thirds of the story. Human error is the biggest challenge we face when it comes to outages, with around 80% of issues being cited as accidental.
If staff are not properly trained and the correct processes are not in place, it doesn't matter if the building and hardware are perfect, it's only a matter of time before an outage will strike. This is why NEXTDC invests hundreds of thousands of dollars every year to educate its staff, partners and vendors in an effort to maximise operational sustainability.
Other data centres may claim they have procedures and trained staff but unless they're regularly assessed by an independent third party and benchmarked against the best data centres on the planet, their claims are worthless.
To qualify for even the lowest Bronze certificate, data centres need to establish training programs for all of their staff. However, Uptime Institute also examines any risks posed by other users of our facility: the clients and partners.
We have quarterly training sessions for our operations team - our staff are tested and trained like no one in the industry and Uptime Institute requires evidence of that. They want to know about all the testing and training carried out, they want to know if we have hired any new staff and they will check to ensure new hires have completed the necessary training and a competency-based assessment.
We need to know that our national partners, for example someone like Nilsen Networks, know what they're doing. We hold regular training days for them on our MOPs (Methods of Procedure) and are required to show evidence of this training to Uptime Institute. The assessors also want to examine our maintenance records to confirm that the procedures are being followed to the letter.
We have a procedure for everything - it's all written down and laid out. We've colour-coded our folders, we've got the command centre set up and we make our staff and partners practise over and over and over again to ensure that, during an emergency, when stress levels are high, they are far less likely to make costly mistakes.
This dedication to detail across the whole process, from design and construction to staffing and maintenance of our facilities, is what sets NEXTDC apart from alternative data centre operators. We pay attention to all the details to ensure that your business remains connected and is available 100% of the time.
It's the extra sleep-at-night factor that you don't get with anybody else. The training and skillset of NEXTDC staff matches the design and engineering excellence of our buildings.
L-R NEXTDC employees Alan Colley, Brett Ridley and Andrew Butler