It's time for IT teams, vendors to prioritize efficiency; here's where they should start

Hint: Greener gear is only half of it

Servers and storage account for the lion's share of datacenter power consumption, yet while bit barns put increasing pressure on local utilities and try the patience of local residents and regulators, the vendors responsible for these systems have shown little interest in doing anything about it.

"In the face of limited power availability in key datacenter markets, together with high power prices and mounting pressure to meet sustainability legislation, enterprise IT's energy footprint will have to be addressed more seriously," Uptime Institute analyst Daniel Bizo wrote in a recent blog post. "This will involve efficiency improvement measures aimed at using dramatically fewer server and storage systems for the same workload."

However, the problem isn't just that datacenters are full of old, inefficient gear that slurps up power in exchange for too little work. These facilities are incredibly complex, requiring operators to carefully balance compute resources against their power and thermal requirements.

This complexity, Bizo contends, often results in poor capacity planning, where more power is provisioned than is actually needed. As a result, he adds, operators build new datacenters while existing capacity is left untapped, placing even greater pressure on local grids.

While much of the conversation around datacenter sustainability has centered on facility-wide efficiency gains, such as Equinix's decision to run its datacenters hotter or others' embrace of direct liquid cooling (DLC), the largest contributor to DC power consumption has remained conspicuously silent: the IT vendors, the analyst outfit found.

However, Uptime thinks that's about to change and cites four key drivers that will force IT vendors to go green. These include blowback from local municipalities, grid limitations and restrictions enforced by local utilities, a greater emphasis on sustainability regulation and reporting, and soaring energy prices, particularly in Europe and the UK.

Some of this is already taking place, particularly in power-constrained regions. Loudoun County in Northern Virginia is just one example. The state has seen a flurry of new regulations mandating stricter sustainability standards and noise mitigations, and limiting grid connections. And Uptime has tracked similar trends in Connecticut, Singapore, Ireland, the Netherlands, and Germany.

Hardware optimizations

So what can be done to bolster IT infrastructure efficiency? According to Uptime analyst and former program manager for IBM's Energy and Climate Stewardship program, Jay Dietrich, there is no shortage of options.

Looking at servers and storage infrastructure, one of the obvious opportunities for vendors is optimizing firmware to improve the efficiency of their products. Even today, Dietrich says, there is often a sizable gap in efficiency, as measured by a system's Server Efficiency Rating Tool (SERT) score, between systems from different vendors, even when they use the same components.

"Some [vendors] clearly do things in their firmware and how they put their equipment together that delivers you a more efficient server," he said.

Efficiencies can also be had from simply cramming more workloads into fewer servers, and several chipmakers, including both AMD and Intel, are aggressively pursuing this.

AMD offered one of the clearest pictures of this during its Genoa launch event last November, when it touted how just five dual-socket Epyc 4 systems, each with 192 cores, could take the place of 15 Intel Ice Lake systems, while consuming half the power in a virtualization workload.

While more compute-dense CPUs and accelerators may allow for greater consolidation and smaller datacenter footprints, they aren't without their headaches, as Uptime noted in a January report. Two of the big ones are thermal management and power delivery infrastructure.

One way to contend with this is to start migrating to liquid and immersion-cooled infrastructure. In addition to being better than air at capturing and removing heat from a chassis, liquid also largely eliminates the need for power-hungry high-pressure fans.

As we've previously reported, upwards of 15 percent of a modern system's power draw can be attributed to the fans used to move air through the chassis. And with dual-socket systems easily exceeding one kilowatt of power draw under load, that could be as much as 150W just to spin the fans.
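The arithmetic behind that figure is simple enough to sketch. The function name and the default fraction below are illustrative assumptions based only on the numbers quoted above, not a measurement of any particular server.

```python
def fan_power_watts(system_power_watts: float, fan_fraction: float = 0.15) -> float:
    """Estimate the power spent on fans as a share of total system draw.

    fan_fraction defaults to the "upwards of 15 percent" figure quoted in the
    article; real systems will vary with chassis design and load.
    """
    if not 0.0 <= fan_fraction <= 1.0:
        raise ValueError("fan_fraction must be between 0 and 1")
    return system_power_watts * fan_fraction


# A dual-socket system drawing roughly one kilowatt under load.
estimate = fan_power_watts(1000.0)
print(f"{estimate:.0f} W")  # prints "150 W"
```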

By embracing DLC, that could be reduced considerably depending on the specific implementation. Some designs use so much copper that fans can be eliminated outright, while others will still require some air movement. And in the case of immersion cooling, no fans are required at all.

IT teams need to prioritize efficiency

However, in Dietrich's eyes, the biggest opportunity is one of priorities.

"The IT team has to approach optimizing efficiency of their infrastructure with the same intensity that they bring to optimizing performance," he said. "I don't think historically that the energy efficiency side of the equations in the IT infrastructure has been an important metric for the team."

According to Dietrich, while AMD's three-to-one consolidation example might sound compelling, it's not that simple in practice. The workloads themselves need to be taken into consideration, otherwise much of the server could be left sitting idle, he explained.

The issue is that a chip like AMD's Genoa doesn't just offer more cores; those cores are also faster. And because most workloads tend to have an operating window in which their utilization is at its highest, there's an opportunity to schedule them to maximize system utilization.

"It's just like a Tetris game where all the pieces have to fit together," Dietrich said. "A system administrator could say, well, I can take these five applications and put them on the server, and they'll fit together pretty well."

For example, if you've got a workload that has its highest utilization during the workday from 9 a.m. to 5 p.m., you might pair it with another workload that runs in the evening, and another that runs overnight in order to maximize efficiency.
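As a rough illustration of that pairing, the sketch below sums three hourly utilization profiles and checks the combined load. The workload shapes, the busy and idle percentages, and the `profile` helper are all hypothetical values chosen for the example, not data from Uptime or any real system.

```python
def profile(start: int, end: int, busy: float, idle: float = 5.0) -> list[float]:
    """Return 24 hourly utilization values (percent of one server).

    The workload runs at `busy` from hour `start` up to (not including) hour
    `end`, and at `idle` otherwise. A window such as 23 -> 9 wraps past midnight.
    """
    hours = []
    for h in range(24):
        if start < end:
            in_window = start <= h < end
        else:
            in_window = h >= start or h < end
        hours.append(busy if in_window else idle)
    return hours


# Hypothetical workloads: one daytime, one evening, one overnight.
workday = profile(9, 17, 60.0)
evening = profile(17, 23, 55.0)
overnight = profile(23, 9, 50.0)

# Combined load on a single server, hour by hour.
combined = [a + b + c for a, b, c in zip(workday, evening, overnight)]
peak = max(combined)
trough = min(combined)
average = sum(combined) / len(combined)

print(f"peak {peak:.0f}%, trough {trough:.0f}%, average {average:.1f}%")
```

With these made-up numbers the combined load never leaves the 60 to 70 percent range, whereas each workload on its own would leave a server mostly idle outside its window.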

The idea is that by doing this, IT teams could potentially achieve even greater degrees of consolidation while also keeping the machine in its peak power efficiency range. We're told that for many modern chips, this is somewhere between 60 and 80 percent utilization.

While this can be done manually, there's a better approach, Dietrich argues: using software tools to automatically identify combinations of workloads likely to maximize utilization.
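The article doesn't name any specific tool, so the following is only a minimal sketch of the general idea, using a greedy first-fit heuristic as a stand-in for whatever a real product does. The `place_workloads` function, the 80 percent ceiling, and the workload profiles are hypothetical; a production tool would also have to weigh memory, I/O, licensing, and failure domains.

```python
def place_workloads(profiles: dict[str, list[float]], ceiling: float = 80.0) -> list[dict]:
    """Greedy first-fit placement of workloads onto servers.

    Each profile is a list of 24 hourly utilization values (percent of one
    server). A workload goes on the first server whose combined hourly load
    stays at or below `ceiling`; otherwise a new server is opened.
    """
    servers: list[dict] = []
    # Place the workloads with the highest peaks first, since they are hardest to fit.
    ordered = sorted(profiles.items(), key=lambda kv: max(kv[1]), reverse=True)
    for name, prof in ordered:
        for server in servers:
            combined = [a + b for a, b in zip(server["load"], prof)]
            if max(combined) <= ceiling:
                server["names"].append(name)
                server["load"] = combined
                break
        else:
            # No existing server had room at every hour.
            servers.append({"names": [name], "load": list(prof)})
    return servers


# Hypothetical hourly profiles, hours 0 through 23.
demo = {
    "web": [5.0] * 9 + [60.0] * 8 + [5.0] * 7,         # busy 09:00-17:00
    "reports": [5.0] * 17 + [55.0] * 6 + [5.0] * 1,    # busy 17:00-23:00
    "backup": [50.0] * 9 + [5.0] * 15,                 # busy 00:00-09:00
    "analytics": [5.0] * 9 + [50.0] * 8 + [5.0] * 7,   # busy 09:00-17:00
}

placement = place_workloads(demo)
for i, server in enumerate(placement):
    print(i, server["names"], f"peak {max(server['load']):.0f}%")
```

Here the three workloads with complementary windows share one server, while the second daytime job lands on its own machine because it would push the first past the ceiling during working hours.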

Dietrich notes this process should probably be done incrementally to ensure stability and uptime of critical services. ®