Energy efficiency, staffing keep datacenter operators awake at night
Outages are declining, but when one does hit, it's expensive
While datacenter operators are under pressure to reduce energy consumption, reliability is gradually increasing and there have been fewer reported disruptive outages. Meanwhile, trust in AI as a tool for operational decision making has actually fallen.
Uptime Institute's Global Data Center Survey 2023 shows that the bit barn industry continues to grow in importance and scale, but faces ongoing staffing and supply chain problems.
The survey covered a total of 879 respondents across multiple countries, with just over half located in North America and Europe.
Although this is the 13th annual survey of this kind, it is the first time that operators were asked to identify their key management concerns. Top among these is apparently improving energy performance of facilities, with 33 percent of respondents indicating they were very concerned about this. Finding enough qualified staff was the next big concern while improving energy efficiency of IT equipment came in third.
However, datacenter managers also report difficulty in forecasting demand and procuring the equipment they need to support any increase in usage - a trend that emerged during the pandemic and still lingers.
In energy efficiency, the bad news is that progress has largely stalled, at least in terms of the most commonly used metric, power usage effectiveness (PUE). The average annual PUE reported for datacenters has remained at about the 1.59 or 1.58 mark for four or five years.
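For readers less familiar with the metric, PUE is the ratio of the total energy a facility draws to the energy that reaches its IT equipment, so a value of 1.0 would mean zero overhead for cooling, power distribution and the like. The short Python sketch below shows the arithmetic. The function names and the meter reading are hypothetical and chosen purely for illustration; only the 1.58 and 1.3 figures come from the survey.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power usage effectiveness: total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


def overhead_share(pue_value: float) -> float:
    """Fraction of total facility energy that does not reach IT equipment."""
    if pue_value < 1:
        raise ValueError("PUE cannot be below 1.0")
    return 1 - 1 / pue_value


if __name__ == "__main__":
    # Hypothetical annual IT load, not a figure from the survey.
    it_kwh = 10_000_000
    # PUE values quoted in the article: the long-running average and the
    # threshold beaten by the best-performing 16 percent of respondents.
    for label, value in (("survey average", 1.58), ("efficient site", 1.3)):
        total_kwh = it_kwh * value
        print(
            f"{label}: PUE {pue(total_kwh, it_kwh):.2f}, "
            f"overhead {overhead_share(value):.1%} of total energy"
        )
```

Put another way, at a PUE of 1.58 a site consumes 0.58 kWh on overhead for every 1 kWh delivered to IT kit, while a site at 1.3 consumes 0.3 kWh.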
This does not indicate an end to progress, but reflects the fact that all the easy gains have already been made, the report states. Further efficiency in many existing facilities would require major refurbishment work that would be costly and potentially disruptive.
Unsurprisingly, more modern facilities perform better, with 16 percent of this year's respondents reporting an average annual PUE below 1.3, mostly in Europe, the US and Canada.
The big question is whether the industry will be able to cut PUE further in future. This will depend on the funds invested in new technologies and how many older data dormitories are decommissioned or fully refurbished, the report states, but the shock of high electricity costs, particularly in Europe, may help to make the case for such investment.
Staffing has been a concern among datacenter operators for more than a decade, exacerbated by the growing demand for capacity. Uptime Institute reports that while this remains an issue, it may at least not be worsening at present, which it attributes to improvements in areas such as training and staff retention, but also possibly more use of automation.
However, the report warns that in markets such as Europe and North America, older staff make up a sizable portion of the workforce, and operators face the possibility that many may retire at about the same time, taking their knowledge of daily operations with them before they can bring junior staff up to speed.
Half of respondents said they had difficulty finding qualified candidates for roles, and 35 percent reported having their staff poached by rivals, double the figure recorded five years ago.
The Uptime survey also reveals that the industry is still very much a boys' club: women make up just 8 percent of datacenter teams, and a quarter of respondents said they have no women at all in those roles.
This is a lower representation even than physically demanding industries such as construction and mining, and the report suggests that operators struggling to find staff should take action to avoid overlooking women and other underrepresented candidates.
In terms of reliability, Uptime Institute reports a gradually improving situation regarding the frequency and severity of outages. For this report, 55 percent of operators said they had experienced an outage in the past three years, down from 60 percent in 2022 and 78 percent in 2020.
The impact of outages also appears to be declining in severity over time. Of respondents who reported an outage in the previous three years, 41 percent rated the impact as negligible, while a further 32 percent rated it as minimal. Only 4 percent reported a severe outage, and a further 6 percent a serious outage.
Historically, those last two categories have accounted for about 20 percent of outages, Uptime Institute said, and suggested a reason for the decline may be that more outages are being caused by partial failures of systems or equipment, rather than total failures.
But the cost of the outages that do occur remains high: 54 percent of respondents disclosed that their most recent significant, serious or severe outage cost more than $100,000, and 16 percent said it cost them more than $1 million. Power failure was by far the most commonly reported cause of an outage.
Other notable findings from the report include that this year, on average, the percentage of an organization's IT workloads hosted in its own corporate bit barn fell below half, to 48 percent.
However, this does not mean that on-premises datacenter capacity, usage or expenditure is shrinking in absolute terms; instead it reflects the growing use of public clouds to host workloads.
The report states that 62 percent of respondents have increased budgets for their data facilities this year, while forecasts are for 15 percent of workloads to be hosted in the public cloud by 2025.
That figure may appear low to some, but Uptime Institute points out that many in its survey represent organizations that operate multiple bit barns and have many legacy applications, and so are likely to be slower to move to the cloud than companies with smaller IT estates.
One surprising finding from this year's survey (or perhaps not so surprising to Reg readers) is that nearly three-quarters of respondents believe that AI-based tools will eventually carry out some datacenter operations and put staff out of a job.
However, confidence that AI can be trusted to make operational decisions in the datacenter has actually taken a big knock this year – down 13 percent from when respondents were asked the same question last year.
The report suggests that the increased media coverage of AI has led to more awareness of the faults and limitations that exist within the current generation of AI-based models and any tools that may be based on them.
Even if AI models do become reliable enough to assume some datacenter roles, the likely short-term impact would be minimal, Uptime Institute concludes, as they would probably take up the slack of the worker shortage and reduce the need for additional hires, rather than replace those already there. ®