Temperatures May Be Cooling but Heat-Induced Failure Should Still Be a Hot Topic

By Dave King, Senior Product Manager, Cadence.

The global heatwave that hit the UK this summer, causing temperatures to soar to highs of 40°C and plunging many parts of the country into drought, presented challenges for every industry, not least data centers. Facility managers struggled, and in several instances failed, to keep data centers cool enough to avoid outages, while also facing difficulties with the limited water supplies that are a critical part of the cooling infrastructure.

The sad reality is this extreme weather, and its repercussions, are not isolated incidents. The Met Office has predicted that, in the next 50 years, top summer temperatures could be 4-7°C hotter, with a significantly higher likelihood of jumping above 40°C.

As autumn draws in, the weather may be getting colder in the short term, but we can’t afford to let our focus on this hot issue cool off. UK data centers, and indeed their counterparts around the world, need to use this brief respite to evolve and manage the ongoing challenges rising temperatures will create. Introducing new technology to facilitate this evolution will be critical to success.

This summer’s hottest trend – heat-induced failure

Faced with record-breaking temperatures globally this summer, many data centers around the world experienced heat-induced failures as they were pushed beyond their limits. Several companies on both sides of the Atlantic had to take the decision to shut down their data centers to repair cooling mechanisms and avoid systems being damaged. This resulted in countless hours of disruption while services were restored and associated “long tail” issues were overcome.

These challenges were not unique to the UK. However, it’s undeniable that the country was less prepared for the swings to high temperatures that are more common elsewhere in the world, and that this resulted in significant consequences for critical infrastructure and the businesses reliant upon it. For example, hospitals were unable to access patients’ medical records, necessitating the cancellation of appointments and further deepening a backlog they have struggled with since the beginning of the pandemic.

It’s clear that the results of these outages are far from inconsequential and that the cause was systems struggling to cope with extreme heat that they were not designed to handle. Retrofitting existing data center systems, and designing new facilities with this in mind, is essential to better deal with the extreme, hot weather we can expect to become commonplace over the next 50 years.

The coolest data center innovation – digital twins

In today’s climate, it’s paramount that data center managers can operate cooling systems as efficiently as possible to prevent outages. Deploying a digital twin, a 3D virtual replica of a physical data center that can visualize a facility under any operating scenario, is an effective way to optimize cooling systems. Using this technology, managers can simulate the cooling loop and airflow from the IT level to the facility level to ensure the system is working as effectively as possible and meeting thermal demands. Furthermore, testing configuration changes virtually first reduces the risk that modifying complex cooling systems in the physical facility will itself cause an outage.

To take a more specific example of how digital twins can support the design of new facilities and optimize their cooling systems, let’s look at Kao Data. Kao Data invested in the technology to make informed decisions around the design, implementation, and operations of an indirect evaporative cooling (IEC) system at its facilities. IEC is a greener, more efficient, and cost-effective approach where water evaporation is used in place of mechanical systems to cool the air. A digital twin was central in ensuring the optimal operational performance of this system, by predicting performance and offering recommendations for the different phases of implementation. This was in addition to highlighting operational risks – which could have resulted in outages – that may never have been discovered until after deployment.

For example, during simulations, the design team validated thermal performance under normal operating conditions. In doing so, they discovered that at full load occupancy, and with a prevailing south-westerly wind, humidity in the external air stream intakes would be raised, and the wet bulb temperature would exceed the appropriate limit. Armed with this information, they made design improvements ahead of construction, ensuring efficiency would remain high and operational risks would be kept to a minimum. Computational fluid dynamics (CFD) was also employed to verify that the design could manage a cooling unit failure, in particular for the IT equipment located nearest the failed unit. Had these steps not been taken, the physical data center would have inevitably struggled to operate in normal conditions, and certainly when put under stress by high temperatures.
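To illustrate why raised humidity at the air intakes matters for evaporative cooling, the short Python sketch below estimates wet bulb temperature from dry bulb temperature and relative humidity. It uses Stull's 2011 empirical approximation, which assumes roughly sea-level pressure; it is not the method used in the Kao Data digital twin, and the function name and sample conditions are hypothetical.

```python
import math


def wet_bulb_stull(dry_bulb_c: float, rh_percent: float) -> float:
    """Approximate wet bulb temperature in deg C (Stull, 2011).

    Assumes near sea-level pressure; the fit is intended for relative
    humidity of roughly 5-99% and air temperatures of about -20 to 50 deg C.
    """
    t, rh = dry_bulb_c, rh_percent
    return (
        t * math.atan(0.151977 * math.sqrt(rh + 8.313659))
        + math.atan(t + rh)
        - math.atan(rh - 1.676331)
        + 0.00391838 * rh ** 1.5 * math.atan(0.023101 * rh)
        - 4.686035
    )


if __name__ == "__main__":
    # Hypothetical intake conditions: same air temperature, rising humidity.
    for rh in (40, 60, 80):
        print(f"35 C at {rh}% RH -> wet bulb {wet_bulb_stull(35.0, rh):.1f} C")
```

At a fixed air temperature, the estimated wet bulb temperature climbs as humidity rises, which narrows the margin an evaporative system has to work with.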

Retrofitting existing facilities to cope with a changing climate is even harder than intelligently designing cooling systems in new centers, but thankfully digital twins can play a role here, too.

Imagine, for instance, a facility looking to upgrade its existing rooftop chillers. Historically, it has seen reduced cooling from the chiller plant on hot days, due to recirculation of airflow in the rooftop plant. It now faces additional complications because new planning regulations mean any change would also require the construction of sound reduction measures, adding an extra wall around the roof. The wall could be louvered but would still present an additional impediment to airflow. Roof space is fixed, with no possibility of changing the building or other existing infrastructure. As a result, there would be no feasible placement options beyond the existing chiller position.

Given these constraints, the only option for mitigation of airflow issues would be the construction of various baffles and chimneys to physically segregate hot and cold air. A digital twin could be used to simulate various options and find a configuration that completely mitigates the airflow issues, but of course, there are further considerations. A solution that fully segregates the airflow may also significantly hamper regular maintenance and cause other operational issues day to day. But with the output from the digital twin, a cost-benefit analysis can be done and might show that partial mitigation would provide the best balance of risk and cost.
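As a rough illustration of the kind of cost-benefit comparison described above, the following Python sketch ranks mitigation options by expected cost over a planning horizon. Every figure, option name, and probability here is a hypothetical placeholder; in practice the outage probabilities would be informed by the digital twin's simulation results and the costs by the site's own estimates.

```python
from dataclasses import dataclass


@dataclass
class Option:
    name: str
    build_cost: float          # one-off construction cost (hypothetical)
    annual_upkeep: float       # extra yearly maintenance burden (hypothetical)
    outage_probability: float  # assumed yearly chance of a heat-induced outage


def expected_cost(option: Option, outage_cost: float, years: int) -> float:
    """Build cost plus yearly upkeep and probability-weighted outage cost."""
    yearly = option.annual_upkeep + option.outage_probability * outage_cost
    return option.build_cost + years * yearly


def cheapest(options: list[Option], outage_cost: float, years: int) -> Option:
    """Return the option with the lowest expected cost over the horizon."""
    return min(options, key=lambda o: expected_cost(o, outage_cost, years))


# Placeholder figures only; none of these come from a real site.
OPTIONS = [
    Option("no mitigation", 0, 0, 0.20),
    Option("partial segregation", 150_000, 10_000, 0.05),
    Option("full segregation", 400_000, 40_000, 0.01),
]

if __name__ == "__main__":
    for opt in OPTIONS:
        print(f"{opt.name}: {expected_cost(opt, 500_000, 10):,.0f}")
    print("lowest expected cost:", cheapest(OPTIONS, 500_000, 10).name)
```

With these placeholder inputs, partial segregation comes out ahead because full segregation's construction and maintenance burden outweighs the extra outage risk it removes. Different inputs could just as easily favor another option, including doing nothing.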

This is significant because existing sites will be constrained by the infrastructure already in place, and changes may fall foul of new planning regulations, further complicating matters. Compromises will need to be made. Simulating the environment to understand the potential risks, the possibilities for mitigation, and the cost of those mitigations will help businesses make the right decisions, notably on if, when, and how they will implement any changes to handle a hotter environment. In some cases, it may be the right decision not to take any mitigating steps. However, even in those cases, knowing from the simulations what impact extreme events will have helps site teams plan ahead when such events are forecast.

Prepare now to avoid pain later

With cooling challenges in data centers on the rise, managers need to be prepared to handle the associated risks. Investing in technology now to understand and plan for the impacts of extreme weather is vital to preventing outages in future warmer months and preparing managers for the challenges that will accompany a warming planet in the long term.
