Automated cooling control in dynamic data centres

Fine-tuning the operation of the cooling equipment needed to keep a busy and constantly changing data centre up and running is an onerous task, requiring a focus on safety and continuous computing availability while keeping costs under control. But Influence Maps and diligent analysis of temperature data can be used to design an optimal cooling plan and automate the ongoing cooling effort for maximum efficiency. By Henrik Leerburg, Product Line Director for StruxureWare for Data Centers, Schneider Electric.


MAINTAINING IT infrastructure at safe operating temperatures while minimising the cost and energy wastage of the cooling equipment needed to do so is a delicate balancing act for data centre operators.

A crude but simple approach is to cool the ambient temperature of the room as a whole. However, outside of particularly cold climate regions that becomes a prohibitively expensive proposition. Far more attractive is to allow the ambient temperature of a computer room or data centre facility to be as warm as possible while directing cooling effort with pinpoint accuracy to the IT equipment where it is most needed. Depending on the occupancy and utilisation of the servers and storage arrays in the data centre’s racks, the cooling load will vary continuously. Deciding how much cooling effort to deploy to individual racks is a process that needs constant monitoring and adjustment.

Frequently this has to be done manually, with operators assessing the temperature of equipment at points within the data centre and adjusting the local computer room air conditioner (CRAC) units accordingly. This is far from ideal, as it represents a very tight process window and places a heavy burden on operating personnel. It also introduces the risk of inaccurate temperature measurement, which can lead to inefficient, or worse, inadequate utilisation of cooling resources.
Part of the problem is assessing the overall impact of adjusting the output of a particular CRAC unit. Upping the cooling output of one unit may have an effect, positive or negative, on the temperature of a rack some distance away thanks to the flow of air throughout the cooling system infrastructure. This necessitates further adjustments whose effects must also be taken into account.
A more proactive method of maintaining adequate cooling at maximum efficiency is to use Influence Maps: schematic tools that allow operators to model the effects throughout a data centre of adjustments made to one or more CRAC units.
An Influence Map is unique to each computer room, containing within it data specific to its layout and contents. This data is captured automatically by software, which at the point of first installation, “learns” the specifics of the site in which it is deployed, as CRAC units are turned on and off and the effects of these actions are measured. In this way a baseline map of the cooling processes in a computer room is drawn up.
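The learning pass described above can be sketched in outline. The following Python is a hypothetical illustration, not Schneider Electric's actual algorithm: it steps each CRAC unit's output in turn, measures the resulting temperature change at every sensor, and records a per-unit influence coefficient. The function names (`set_crac_output`, `measure_temps`) are illustrative stand-ins, not a real API.

```python
def learn_influence_map(crac_ids, sensor_ids, set_crac_output, measure_temps,
                        delta=10.0):
    """Build a baseline influence matrix for one computer room.

    Returns {crac_id: {sensor_id: degC change per % of cooling output}}.
    A hypothetical sketch: real learning would wait for temperatures to
    settle and average several readings between steps.
    """
    baseline = measure_temps()                     # {sensor_id: temp in degC}
    influence = {}
    for crac in crac_ids:
        set_crac_output(crac, +delta)              # step one unit's output up
        stepped = measure_temps()                  # observe the whole room
        influence[crac] = {
            s: (stepped[s] - baseline[s]) / delta  # degC per % output change
            for s in sensor_ids
        }
        set_crac_output(crac, -delta)              # restore the unit's output
    return influence
```

The resulting matrix captures exactly the cross-effects described above: a unit some distance from a rack still appears in that rack's column if its airflow reaches it.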
These maps can then be used as the basis of an automated cooling plan, providing operators with an accurate picture of the effects of adjusting the outputs of CRAC units so that optimal cooling can be realised. Of course, no data centre operates in a steady state.

Changes in load are continuous, with more racks of servers, storage arrays and communications equipment being added and older ones replaced. Each rack will have different load and cooling characteristics, and the automated cooling plans must be adjusted to take account of these. Adjusting the baseline of an Influence Map is therefore a regular, typically weekly, process.
Schneider Electric’s Data Center Operation: Cooling Optimize module is an add-on for the company’s existing StruxureWare for Data Centers Data Center Infrastructure Management (DCIM) software. The module makes use of Intelligent Analytics technology to gather data in real time and produce the necessary Influence Maps.
These in turn allow the Cooling Optimize module to react to this data in a closed-loop system, automatically identifying and eliminating hot spots and helping to diagnose potential overheating risks.
Cooling Optimize makes use of a dense array of temperature sensors to determine exactly where the heat load is within a data centre. This temperature data is aggregated and transmitted wirelessly to a purpose-built appliance where it is analysed by control software, which then sends adjustment commands to the cooling equipment. As the server and storage load changes, the built-in machine learning automatically adjusts cooling output to match the requirements of the data centre.
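One pass of such a closed loop can be sketched as follows. This is a deliberately simplified illustration, not the product's actual control law: the deadband, gain and "strongest influence wins" heuristic are all assumptions made for the example, and the influence matrix has the shape produced by the learning sketch above.

```python
def control_step(temps, setpoints, influence, set_crac_output, gain=0.5):
    """One closed-loop pass over the room (illustrative sketch).

    temps / setpoints: {sensor_id: degC}
    influence: {crac_id: {sensor_id: degC per % output}} from the baseline map
    """
    for sensor, temp in temps.items():
        error = temp - setpoints[sensor]       # positive means too hot
        if abs(error) < 0.5:                   # deadband: close enough
            continue
        # pick the unit whose output change moves this sensor the most
        crac = max(influence, key=lambda c: abs(influence[c][sensor]))
        coeff = influence[crac][sensor]        # degC per % output (negative)
        if coeff:                              # guard against a dead entry
            # output change that would cancel the error, damped by the gain
            set_crac_output(crac, gain * error / -coeff)
```

Repeating this step against fresh sensor data is what lets hot spots be identified and trimmed out automatically rather than by manual CRAC adjustment.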
The module balances the need for cooling with the lowest possible energy expenditure, allowing the ambient temperature of the facility to rise while delivering sufficient cooling to where it is needed, resulting in immediate cost savings.
The intelligent control provided by Cooling Optimize improves the manageability of a data centre: its automatic closed-loop adjustments make the necessary changes in real time, while the Influence Maps give operators greater insight into the systemic effects of those adjustments.

Thermal airflow is constantly adapted to match the cooling needs created by the data centre’s changing characteristics. The continuously optimised cooling capacity allows operators to increase both the load and capacity of IT equipment in their data centres, confident that the cooling infrastructure deployed is sufficient to keep the facility operating. The constant collection and aggregation of temperature data helps mitigate risks to the safe operation of the IT equipment in a data centre. Up to 95% of hotspots are automatically resolved and data is also provided to help operators diagnose trickier issues.

In the event of an emergency, cooling units will automatically run at maximum capacity and thereby ensure a cool facility until such time as the issue can be resolved and the data centre can return to optimal efficiency. This safety feature applies even if the Cooling Optimize module is unable to connect to or control the cooling units in question.
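This fail-open behaviour is naturally expressed as a watchdog on the unit side: the unit honours the optimiser's setpoint only while commands are fresh, and reverts to full output otherwise. The sketch below is a hypothetical illustration of that pattern; the class name, timeout and interface are assumptions, not Schneider Electric's implementation.

```python
import time

class CRACFailsafe:
    """Illustrative fail-open watchdog for a single CRAC unit.

    The unit runs at 100% output unless it has received a fresh setpoint
    from the control software within `timeout` seconds.
    """
    def __init__(self, timeout=60.0):
        self.timeout = timeout
        self.setpoint = 100.0      # default before any command: full cooling
        self.last_cmd = None       # time the last command arrived

    def command(self, setpoint, now=None):
        """Record a setpoint (in % of capacity) from the optimiser."""
        self.setpoint = setpoint
        self.last_cmd = time.monotonic() if now is None else now

    def effective_output(self, now=None):
        """Output the unit actually runs at, in % of capacity."""
        now = time.monotonic() if now is None else now
        if self.last_cmd is None or now - self.last_cmd > self.timeout:
            return 100.0           # contact lost: run at maximum capacity
        return self.setpoint
```

Because the fallback lives in the unit rather than the controller, the facility stays cool even when the optimiser cannot connect at all, matching the behaviour described above.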
Providing optimal control of cooling equipment, tailored dynamically to the changing load, can dramatically reduce the running costs of a data centre.

As much as 40% of cooling energy costs can be eliminated by directing cooling effort only where it is needed and removing redundancy. Further cost savings in the long term are delivered by efficient use of cooling equipment: the more optimally it is used, the less unnecessary wear and tear it experiences, and so it requires less maintenance and lasts longer.
The Cooling Optimize module provides operators with rich and detailed reports of the operations of the cooling equipment, allowing greater fine-tuning and long-term planning based on the insights delivered.

A Benchmark Report verifies energy and cost savings as well as greenhouse-gas reductions achieved through the use of active cooling control. A Temperature Compliance report determines whether rack temperatures have complied with required set points, and if they have not, it also records the time period for which racks have been in violation of these limits. This allows easy detection of potential problems within a data centre so that they can be addressed before they become critical.
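The violation-time accounting behind such a compliance report reduces to summing the intervals during which each rack's readings exceeded its set point. The sketch below is a hypothetical illustration of that calculation, not the report's actual implementation; it assumes each reading holds until the next sample arrives.

```python
def compliance_report(readings, max_temp):
    """Total seconds each rack spent above `max_temp` (illustrative).

    readings: {rack: [(timestamp_seconds, temp_degC), ...]}, sorted by time.
    Each reading is treated as valid until the next sample, so the final
    sample of a series contributes no duration.
    """
    violation = {}
    for rack, series in readings.items():
        total = 0.0
        for (t0, temp), (t1, _) in zip(series, series[1:]):
            if temp > max_temp:          # rack was out of compliance
                total += t1 - t0         # for the whole sampling interval
        violation[rack] = total
    return violation
```

Non-zero entries flag exactly the racks described above: those that have been in violation of their limits, together with how long, so problems can be chased down before they become critical.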
Conclusion
Making use of the data-collection and analysis capabilities of the Cooling Optimize module, coupled with its intelligent use of Influence Maps, allows data centre operators to fine-tune their operations to deliver sufficient cooling while avoiding the twin perils of over-provisioning (and therefore increased cost) and overheating.