How OpenAI is fuelling the liquid cooling boom

As OpenAI’s advanced models crank up the heat on data centres, Vivek Swaminathan, Director of Products and Solutions, Digital Workplace Solutions at Unisys, explains why liquid cooling could be the key to unlocking AI’s full potential.

The latest AI models from OpenAI are expanding technological possibilities, making AI more affordable and accessible. However, to fully democratise this technology and capitalise on the AI boom, data centre infrastructure must be upgraded.

According to the International Energy Agency, data centres worldwide accounted for about 1-1.3% of global electricity usage in 2023. As AI is integrated further into everyday work and life, data centre electricity consumption is expected to increase by 50% by 2027 and by 165% by 2030. This drastic increase would put immense pressure on existing data centre infrastructure, which must deliver ever more power without letting organisations’ systems overheat.

Most data centres currently use traditional air-cooling systems, but with rising cooling demands, liquid cooling has emerged as a more efficient and sustainable solution, poised to become the premier choice for data centres worldwide.

The strain AI can put on data centres

AI training and inferencing are the two critical phases of machine learning, enabling AI to learn and then make predictions. Both phases are energy-intensive. As data centres take on more AI workloads, this places growing demands on the graphics processing units (GPUs) that perform the underlying complex computations.

For example, AI training requires massive parallel processing to analyse datasets and adjust billions of parameters. Training a single model can take weeks of intense, sustained GPU utilisation, putting the system at risk of overheating. Additionally, inferencing, or generating responses for the user, applies trained models to real-world data. This process is less computationally demanding than training, but it relies on GPUs for low-latency tasks like autonomous driving or medical imaging.
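
The difference in load is easy to see in miniature. The toy sketch below, which uses PyTorch purely for illustration and an entirely made-up model and dataset, contrasts a training step, which runs a forward pass, a backward pass and a parameter update, with an inference step, which runs the forward pass alone:

```python
# Toy illustration of why training is heavier than inference.
# The model, data and sizes are arbitrary; PyTorch is used purely as an example.
import torch

model = torch.nn.Linear(1024, 1024)           # stand-in for a much larger network
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
inputs = torch.randn(64, 1024)                 # a batch of synthetic data
targets = torch.randn(64, 1024)

# Training step: forward pass, backward pass (gradients) and a parameter update.
# Repeated billions of times over weeks, this keeps GPUs at sustained, near-full load.
loss = torch.nn.functional.mse_loss(model(inputs), targets)
loss.backward()
optimizer.step()
optimizer.zero_grad()

# Inference step: forward pass only, with no gradients stored and no parameters updated,
# so it is cheaper per request but still relies on the GPU for low-latency responses.
with torch.no_grad():
    predictions = model(inputs)
```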

This increase in energy usage can push data centres consistently beyond their recommended operating temperature range, causing unsustainable wear and tear. The result is costly maintenance and a shortened GPU lifespan, limiting a data centre’s effectiveness.
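
To make the overheating risk concrete, here is a minimal monitoring sketch using the publicly available pynvml (nvidia-ml-py) bindings; the 85°C alert threshold is an illustrative assumption for this example, not a vendor-recommended limit:

```python
# Minimal sketch: poll GPU temperature and utilisation during a long-running job.
# Assumes the nvidia-ml-py (pynvml) package and an NVIDIA GPU; the 85 C alert
# threshold is an illustrative assumption, not a vendor specification.
import time
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # first GPU in the system

ALERT_THRESHOLD_C = 85  # hypothetical ceiling for this example

try:
    while True:
        temp = pynvml.nvmlDeviceGetTemperature(handle, pynvml.NVML_TEMPERATURE_GPU)
        util = pynvml.nvmlDeviceGetUtilizationRates(handle).gpu
        print(f"GPU temperature: {temp} C, utilisation: {util}%")
        if temp >= ALERT_THRESHOLD_C:
            print("Warning: GPU running hot -- cooling may be insufficient.")
        time.sleep(30)  # sample every 30 seconds
finally:
    pynvml.nvmlShutdown()
```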

The evolution of cooling technology

Due to the technology’s computational demands, advances in AI are directly driving innovation in data centre cooling systems. Data centres that rely on air cooling spend about 40% of their energy on cooling alone. Air-cooling systems also take up more space, reducing the data centre’s capacity for IT equipment. Liquid cooling offers a more efficient alternative, with an estimated 15% improvement in Total Usage Effectiveness (TUE) and a roughly 10% decrease in total data centre power usage.
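
A rough, back-of-the-envelope calculation shows how a cooling efficiency gain translates into facility-level savings. The Power Usage Effectiveness (PUE) values and IT load below are assumptions chosen only for the sake of the example, not measured figures:

```python
# Illustrative comparison of air vs liquid cooling at the facility level.
# The PUE values and IT load are assumptions for this example only.
it_load_mw = 10.0            # power drawn by the IT equipment itself
pue_air = 1.5                # assumed Power Usage Effectiveness with air cooling
pue_liquid = 1.3             # assumed PUE after moving to liquid cooling

total_air = it_load_mw * pue_air        # total facility power, air-cooled
total_liquid = it_load_mw * pue_liquid  # total facility power, liquid-cooled

saving = (total_air - total_liquid) / total_air
print(f"Air-cooled facility power:    {total_air:.1f} MW")
print(f"Liquid-cooled facility power: {total_liquid:.1f} MW")
print(f"Reduction in total power:     {saving:.0%}")  # ~13% under these assumptions
```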

Liquid cooling can also dissipate more heat and better maintain the efficiency of high-density computing systems than air cooling can. Data centre operators can also focus cooling on specific central processing units (CPUs) and GPUs, allowing them to optimise thermal management and minimise the space required for cooling equipment.

Despite its benefits, however, liquid cooling is not a simple solution. The technology comes with its own challenges, primarily around maintenance. Unlike conventional air-cooling systems, liquid cooling requires complex plumbing and careful coolant management, and maintaining these intricate components is crucial to ensuring optimal performance.

Imagine you have a toy that runs on batteries and gets very hot when used for a long time. It might stop working or break if you don’t cool it down. Liquid cooling is like giving that toy a special drink that helps keep it cool while in use. However, the toy can overheat again if that drink leaks or gets dirty.

Those who invest in liquid cooling, and in the maintenance required to keep it operational, will benefit. That upkeep is not trivial, however: each dollar spent on liquid cooling hardware typically incurs an annual maintenance cost of $0.30 to $0.50.
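
Taken at face value, that ratio makes the ongoing cost easy to estimate. In the trivial sketch below, the $2 million hardware spend is an arbitrary example figure:

```python
# Estimate annual liquid-cooling upkeep from the $0.30-$0.50 per hardware dollar figure.
# The hardware spend below is an arbitrary example value.
hardware_spend = 2_000_000          # USD spent on liquid cooling hardware
upkeep_low, upkeep_high = 0.30, 0.50

annual_low = hardware_spend * upkeep_low
annual_high = hardware_spend * upkeep_high
print(f"Expected annual upkeep: ${annual_low:,.0f} to ${annual_high:,.0f}")
```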

The bottom line

AI’s hunger for computational power isn’t slowing down. As OpenAI and others push GPU boundaries, liquid cooling isn’t just an option; it’s the backbone of long-lasting AI growth. 

Liquid cooling not only decreases energy demand but also extends the lifespan of GPUs and increases processing speeds, ultimately enabling even more innovation down the line. In the AI-driven future, staying cool isn’t just an advantage. It’s survival.
