Intel, AMD just created a headache for datacenters
Server silos didn't see today's watt-gobbling, space-heater chips coming
In pursuit of ever-higher compute density, chipmakers are juicing their chips with more and more power, and according to the Uptime Institute, this could spell trouble for many legacy datacenters ill equipped to handle new, higher wattage systems.
AMD's Epyc 4 Genoa server processors announced late last year, and Intel's long-awaited fourth-gen Xeon Scalable silicon released earlier this month, are the duo's most powerful and power-hungry chips to date, sucking down 400W and 350W respectively, at least at the upper end of the product stack.
The higher TDPs arrive in lock step with higher core counts and clock speeds than previous generations from either vendor. It's now possible to cram 192 x86 cores into a typical 2U dual-socket system, something that just five years ago would have required at least three nodes.
However, as Uptime noted, many legacy datacenters were not designed to accommodate systems this power dense. A single dual-socket system from either vendor can easily exceed a kilowatt, and depending on the kinds of accelerators being deployed in these systems, boxen can consume well in excess of that figure.
The rapid trend towards hotter, more power dense systems upends decades-old assumptions about datacenter capacity planning, according to Uptime, which added: "This trend will soon reach a point when it starts to destabilize existing facility design assumptions."
A typical rack remains under 10kW of design capacity, the analysts note. But with modern systems trending toward higher compute density and by extension power density, that's no longer adequate.
While Uptime notes that datacenter operators can optimize new builds for higher rack power densities, those facilities still need to account for 10 to 15 years of headroom. As a result, operators must speculate as to long-term power and cooling demands, which invites the risk of under- or over-building.
With that said, Uptime estimates that within a few years a quarter rack will reach 10kW of consumption. That works out to approximately 1kW per rack unit for a standard 42U rack.
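The arithmetic behind that estimate can be sketched in a few lines (a quarter of a standard 42U rack is roughly 10.5U; the figures are Uptime's, the calculation is ours):

```python
# Back-of-the-envelope check of Uptime's density estimate.
RACK_UNITS = 42                        # standard full-height rack
quarter_rack_units = RACK_UNITS / 4    # 10.5U
quarter_rack_power_kw = 10.0           # Uptime's projected draw for a quarter rack

power_per_unit_kw = quarter_rack_power_kw / quarter_rack_units
print(f"{power_per_unit_kw:.2f} kW per rack unit")  # prints "0.95 kW per rack unit"
```

Which rounds to the roughly 1kW per rack unit the analysts describe.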
Powering these systems isn't the only challenge facing datacenter operators. All computers are essentially space heaters that convert electricity into computational work with the byproduct being thermal energy.
According to Uptime, high-performance computing applications offer a glimpse of the thermal challenges to come for more mainstream parts. One of the bigger challenges is substantially lower case temperatures compared to prior generations: these have fallen from the 80C to 82C range just a few years ago to as low as 55C for a growing number of models.
"This is a key problem: removing greater volumes of lower-temperature heat is thermodynamically challenging," the analysts wrote. "Many 'legacy' facilities are limited in their ability to supply the necessary airflow to cool high-density IT."
To mitigate this, the American Society of Heating, Refrigerating and Air-Conditioning Engineers (ASHRAE) has issued revised operating recommendations [PDF] for datacenters, including provisions for dedicated low-temperature areas.
Liquid cooling has also gained considerable attention as chips have grown ever hotter. During the Supercomputing Conference last year we took a deeper dive into the various technologies available to cool emerging systems.
But while these technologies have matured in recent years, Uptime notes they still suffer from a general lack of standardization "raising fears of vendor lock in and supply chain constraints for key parts as well as reduced choice in server configurations."
Efforts to remedy these challenges have been underway for years. Both Intel and the Open Compute Project are working on liquid and immersion cooling reference designs to improve compatibility across vendors.
Early last year Intel announced a $700 million "mega lab" which would oversee the development of immersion and liquid cooling standards. Meanwhile, OCP's advanced cooling solutions sub-project has been working on this problem since 2018.
Despite these challenges, Uptime notes that the flux in datacenter technologies also opens doors for operators to get a leg up on their competition, if they're willing to take the risk.
Power is getting more expensive
And there may be good reason to do just that, according to Uptime's research, which shows that energy prices are expected to continue their upward trajectory over the next few years.
"Power prices were on an upward trajectory before Russia's invasion of Ukraine. Wholesale forward prices for electricity were already shooting up — in both the European and US markets — in 2021," Uptime noted.
While not directly addressed in the institute's report, it's no secret that direct liquid cooling and immersion cooling can achieve considerably lower power usage effectiveness (PUE) compared to air cooling. The metric is the ratio of a facility's total power draw to the power consumed by its compute, storage, and networking equipment. The closer the PUE is to 1.0, the more efficient the facility.
Immersion cooling has among the lowest PUE ratings of any thermal management regime. Vendors like Submer often claim efficiency ratings as low as 1.03.
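Since PUE is just a ratio, it is easy to illustrate; a minimal sketch, with hypothetical wattages chosen to show a legacy air-cooled facility next to one matching Submer's claimed figure:

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power usage effectiveness: total facility power over IT equipment power."""
    return total_facility_kw / it_equipment_kw

# A facility drawing 1,500kW overall to run 1,000kW of IT gear:
print(round(pue(1500, 1000), 2))  # prints 1.5

# An immersion-cooled facility matching Submer's claimed efficiency,
# spending just 30kW on everything besides the IT load:
print(round(pue(1030, 1000), 2))  # prints 1.03
```

In other words, at a PUE of 1.5 every watt of compute costs an extra half-watt of overhead; at 1.03 that overhead all but disappears.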
Every watt saved by IT reduces pressures elsewhere
The cost of electricity isn't the only concern facing datacenter operators, Uptime analysts noted. They also face regulatory and environmental hurdles from municipalities concerned about the space and power consumption of neighboring datacenter operations.
The European Commission is expected to adopt new regulations under the Energy Efficiency Directive which, Uptime says, will force datacenters to reduce both energy consumption and carbon emissions. Similar regulation has been floated stateside. Most recently, a bill was introduced in the Oregon assembly that would require datacenters and cryptocoin mining operations to curb carbon emissions or face fines.
Uptime expects the opportunities for efficiency gains to become more evident as these regulations force regular reporting of power consumption and carbon emissions.
"Every watt saved by IT reduces pressures elsewhere," the analysts wrote. "Reporting requirements will sooner or later shed light on the vast potential for greater energy efficiency currently hidden in IT." ®