How automation can drive out downtime

Nokia Event-Driven Automation platform is on a mission to remove mistakes from datacenter operations

Sponsored feature For decades, the term "human error" has been a byword for accidents and adverse outcomes in safety-relevant industries like transport and industrial processing. But more recently it’s also been appropriated for risk-critical operations in the IT sector.

In high-stakes datacenter environments, human error is now cited as a major cause of outages. The 2023 Datacenter Resiliency Survey from certification body the Uptime Institute found that, between 2020 and 2023, about 39 percent of respondents were hit by a major outage in which human error played a role. Some 47 percent of those polled reckoned their errors were caused by datacenter personnel "failing to follow" correct procedure.

Whether those personnel had been sufficiently well trained in correct procedure fell outside the scope of the poll. However, it’s a question that points the way to more revealing insights into the human error challenge faced by datacenter owners.

A means of minimizing or even eradicating human error will evidently give datacenter operators a competitive advantage as they automate to cope with the rigors AI applications and workloads are already bringing. And AI itself has a key role to play in datacenter automation.

Researchers at the Uptime Institute say they view human error as but one causal factor in datacenter downtime. Rarely are errors made by a human (or humans) the sole cause of datacenter system disruption. That aside, they estimate that historically, human error still "plays a role" in 66 to 80 percent of datacenter outages. By any measure that’s significant enough for the probability of human error to be factored into any IT automation strategy, as a potential initiator of unforeseen disruption that an automated management solution must stand ready to resolve.

Another reason for tackling the human error challenge is that, with the right analytical tools, it can point the way to deeper potential malfunctions within complex datacenter architectures. Such tools give systems developers the visibility they need to find fixes for hidden potential problems that might otherwise have been attributed solely to human error.

Automation acceptance

According to a report by ResearchAndMarkets, automation now represents a significant "transformative force" within the global IT sector. It valued the market for datacenter automation solutions at $12.8 billion in 2023, and projects it to reach $40.5 billion by 2030 (that’s a rosy CAGR of 17.8 percent over the forecast period).
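For readers who want to sanity-check the growth figure, the short sketch below applies the standard compound annual growth rate formula to the two market valuations quoted above. The function name is our own, and the seven-year span assumes growth is counted from 2023 to 2030; the report may use a different base year, so expect the result to land close to, rather than exactly on, the quoted percentage.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.10 means 10 percent)."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the article: $12.8 billion (2023) and $40.5 billion (2030).
# Assumption: seven compounding years between the two valuations.
rate = cagr(12.8, 40.5, 2030 - 2023)
print(f"Implied CAGR: {rate:.1%}")
```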

Such growth indicates that datacenter operators are turning to automation because they have few – if any – other coping strategies open to them. Recruiting new team members can be arduous and frustrating. The workforce talent they need to meet escalating demand often just isn’t available.

In brief, datacenter automation governs how workflows and processes (routine scheduling, monitoring, maintenance, application delivery, network management, and so on) are managed and executed under an automated regime that applies controls in accordance with policies and standards, all without human administration (although humans can intervene if necessary).

From an error-reduction perspective it’s about leveraging technology to systematize tasks that are repetitive, routine, and likely to introduce mistakes. Distancing the error-prone human element from routine facilities operations brings another key benefit in addition to reducing the scope for inadvertent blunders. It can create opportunities for datacenter practitioners to move up into roles where their expertise adds more strategic value to the business mission: performance fine-tuning, growth planning, and implementing new technologies more rapidly, for example. With humdrum tasks eliminated by automation, seasoned datacenter staff may also gain the time and space to share their knowledge with less-experienced team members to help ensure operational continuity.

Clearly, with market dynamics moving so fast, the basic principles on which traditional datacenter automation has been based need to be rethought for the next-generation infrastructure build-outs that are rapidly becoming the new industry norm.

Driven by events

This is why Nokia Event-Driven Automation (EDA) is attracting the attention of datacenter operators. Launched in 2024 and designed to scale from large AI fabrics to small edge clouds, EDA is available through on-premises and cloud-based ‘as-a-service’ subscription models, and is a key component of the Nokia Data Center Fabric solution. It is built on Kubernetes, the open-source container orchestration platform that automates the deployment, scaling, and management of containerised applications, and also draws on Kubernetes’ extensive open-source ecosystem.

The Nokia EDA concept is an approach to IT systems management focused on triggering actions based on events, rather than regularly polling the infrastructure to check what’s going on, or relying on notifications of scheduled tasks. When an event related to the configuration or state of the infrastructure occurs – an alert, network change, telemetry update or end-user action, for instance – the EDA platform automatically executes operational logic.
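To make the contrast with polling concrete, here is a minimal, generic sketch of the event-driven pattern. It is not Nokia's API: the EventBus class, the event names and the handler are all hypothetical, chosen only to show that logic runs when an event arrives rather than on a timer.

```python
from collections import defaultdict
from typing import Callable

Handler = Callable[[dict], None]


class EventBus:
    """Hypothetical in-memory dispatcher: handlers run only when events arrive."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> int:
        """Deliver the event to every subscriber; return how many handled it."""
        handlers = self._handlers.get(event_type, [])
        for handler in handlers:
            handler(payload)
        return len(handlers)


actions: list[str] = []


def on_link_down(event: dict) -> None:
    # Illustrative operational logic; a real platform would change network state.
    actions.append(f"reroute traffic away from {event['interface']}")


bus = EventBus()
bus.subscribe("link_down", on_link_down)

handled = bus.publish("link_down", {"interface": "ethernet-1/1"})
ignored = bus.publish("telemetry_update", {"cpu": 12})  # no subscriber, no work
```

Nothing here loops or sleeps waiting for change. Work happens only in response to a published event, which is the property the event-driven approach trades on.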

Within EDA, controllers continually reconcile managed elements based on external or internal events in the system with the help of pluggable automation applications. These applications define their own resources and define the logic these resources enact when created, modified or deleted.
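The reconciliation idea can be sketched in a few lines. The example below is an illustration of the general controller pattern familiar from Kubernetes, not EDA's implementation: the reconcile function and the VLAN resources are hypothetical, and real controllers act on live devices rather than dictionaries.

```python
def reconcile(desired: dict[str, dict], actual: dict[str, dict]) -> list[str]:
    """Bring `actual` in line with `desired` and return the actions taken."""
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actual[name] = dict(spec)
            actions.append(f"create {name}")
        elif actual[name] != spec:
            actual[name] = dict(spec)
            actions.append(f"modify {name}")
    # Anything present but no longer declared is removed.
    for name in list(actual):
        if name not in desired:
            del actual[name]
            actions.append(f"delete {name}")
    return actions


# Hypothetical resources: what the operator declared versus what is running.
desired = {"vlan-10": {"mtu": 9000}, "vlan-20": {"mtu": 1500}}
actual = {"vlan-10": {"mtu": 1500}, "vlan-30": {"mtu": 1500}}

first_pass = reconcile(desired, actual)
second_pass = reconcile(desired, actual)  # already converged, so nothing to do
```

The second pass returning no actions is the point: reconciliation is idempotent, so it can safely be re-run every time an event fires.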

Human error risks are mitigated through EDA’s integrated digital twin, pre- and post-deployment checkpoints, multi-dimensional observability, and a robust CI/CD (Continuous Integration/Continuous Delivery/Deployment) methodology with revision control.

A platform for software development

Essentially a software development methodology, CI/CD automates the building, testing and deployment of configuration changes, aiming for faster and more reliable software releases. It involves two main practices: Continuous Integration, where code changes are frequently integrated and tested; and Continuous Delivery/Deployment, where the code is prepared for deployment and (optionally) deployed automatically.
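As a rough illustration of how such a pipeline gates a change, the sketch below runs build, test and deploy stages in order and stops at the first failure. The stage functions, the change format and the MTU limit are all assumptions made for the example, not part of any real CI/CD product.

```python
from typing import Callable

Stage = tuple[str, Callable[[dict], bool]]


def run_pipeline(change: dict, stages: list[Stage]) -> list[str]:
    """Run stages in order; a failed stage blocks everything after it."""
    log = []
    for name, check in stages:
        if not check(change):
            log.append(f"{name}: failed")
            break
        log.append(f"{name}: ok")
    return log


def build(change: dict) -> bool:
    return "config" in change


def validate(change: dict) -> bool:
    # Hypothetical test: the MTU must fall in an assumed valid range.
    return 0 < change["config"].get("mtu", 0) <= 9216


def deploy(change: dict) -> bool:
    change["deployed"] = True
    return True


stages = [("build", build), ("test", validate), ("deploy", deploy)]

good = {"config": {"mtu": 9000}}
bad = {"config": {"mtu": 0}}
good_log = run_pipeline(good, stages)
bad_log = run_pipeline(bad, stages)
```

The faulty change never reaches the deploy stage, which is how automated testing keeps a typo from becoming an outage.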

Operational simplicity in Nokia EDA is enabled through intent-based declarative automation, Generative AI assistance and a low-code/no-code approach to building customized dashboards. Another feature, Zero Touch Provisioning, turns device onboarding into a plug-in-and-power-up process, with EDA providing topology discovery and a bootstrap process to bring new nodes online.
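What "intent-based" means in practice is easiest to see in miniature. In the hypothetical sketch below, the operator states only how many spines and leaves a fabric should have, and a function expands that into per-device settings. The render_fabric name, the intent fields and the AS-numbering scheme are our own assumptions, not EDA's data model.

```python
def render_fabric(intent: dict) -> dict[str, dict]:
    """Expand a high-level fabric intent into per-device configuration."""
    spine_names = [f"spine-{i}" for i in range(1, intent["spines"] + 1)]
    configs = {}
    for name in spine_names:
        configs[name] = {"role": "spine", "asn": intent["base_asn"]}
    for i in range(1, intent["leaves"] + 1):
        configs[f"leaf-{i}"] = {
            "role": "leaf",
            "asn": intent["base_asn"] + i,  # assumed scheme: one AS number per leaf
            "uplinks": list(spine_names),   # every leaf connects to every spine
        }
    return configs


# The operator declares the "what"; the function derives the "how".
intent = {"spines": 2, "leaves": 3, "base_asn": 65000}
configs = render_fabric(intent)
```

Because device settings are derived rather than typed by hand, adding a fourth leaf means changing one number instead of editing several configurations.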

EDA integrates into multi-vendor, multi-domain environments with support for a range of IT service management systems, event notification systems, and cloud management platforms.

Going beyond basic customisations, EDA also provides a fully-featured platform for application development. Apps built on EDA leverage a generic intent framework that allows the building of event-driven applications to implement intents. Apps are also straightforward to write, Nokia says, and can be iterated on using a CI/CD workflow in production.

As a key component of the Nokia Data Center Fabric solution, EDA complements and extends the vendor’s Service Router Linux (SR Linux) network operating system, which features a complete set of programmatic and telemetry interfaces.

While EDA is vendor-agnostic, it provides insight into how the network is operating when it is combined with SR Linux. As a cloud-native platform, EDA runs on Kubernetes. With EDA, you can leverage the Kubernetes ecosystem and API operations to manage and consume the network. EDA also augments Kubernetes to make network automation more predictable, avoid eventual consistency and ensure that the network is not left in an indeterminate state.
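One common way to avoid leaving a network half-configured is to treat a multi-device change as a single transaction. The sketch below shows that general all-or-nothing technique with a snapshot and rollback; it is an assumption about how such a guarantee could be achieved, not a description of EDA's internals, and every name in it is hypothetical.

```python
import copy


class ApplyError(Exception):
    """Raised when a device cannot accept its new configuration."""


def apply_transaction(network: dict[str, dict], changes: dict[str, dict]) -> bool:
    """Apply every change or none: restore the snapshot if any device fails."""
    snapshot = copy.deepcopy(network)
    try:
        for device, new_config in changes.items():
            if device not in network:
                raise ApplyError(f"unknown device {device}")
            network[device].update(new_config)
        return True
    except ApplyError:
        # Roll back so the network is never left partially changed.
        network.clear()
        network.update(snapshot)
        return False


network = {"leaf-1": {"mtu": 1500}, "leaf-2": {"mtu": 1500}}

ok = apply_transaction(network, {"leaf-1": {"mtu": 9000}, "leaf-2": {"mtu": 9000}})
# The second change set names a device that does not exist, so it is rejected whole.
failed = apply_transaction(network, {"leaf-1": {"mtu": 4000}, "leaf-9": {"mtu": 4000}})
```

After the failed attempt, leaf-1 is back at its previous setting even though its change was applied before the error surfaced.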

With EDA, Nokia has extended Kubernetes to the network, leveraging a cloud-based microservices architecture and applying key Kubernetes concepts (intent-based, declarative and event-driven operation, plus revision control) that enhance network orchestration and automation. Nokia also leverages Kubernetes tooling and APIs to further streamline EDA operation and integration.

Why automation? And why now?

It’s taken time for the automation proposition to win over datacenter operators, but survey research by market-watcher IDC indicates that almost half of IT organisations polled are ready and willing to have the network take direct management action prompted by comprehensive network intelligence and insights. So why now?

“Network data collection has expanded across a full spectrum of sources,” explains Mark Leary, Research Director at IDC. “Insights from all that data [have been] heightened, bolstered by AI/Machine Learning-driven analytics engines. This combination of detailed intelligence and deep insights has instilled [datacenter operators’] confidence in the resulting automated actions.”

Leary adds: “Factor-in the growing shortages and pressures associated with [staffing], and enterprises are further influenced to let the [IT] play an active role in managing itself.”

Meanwhile, Nokia is already deploying EDA to enable service providers to build datacenter networks that can meet the demands of AI workloads. In a partnership with Maxis, Malaysia’s telecommunications provider, Nokia will deploy its 7220 Interconnect Router (IXR) datacenter switches (another key component of the Nokia Data Center Fabric solution) and EDA technology across Maxis datacenters. Nokia’s switches using the SR Linux network operating system are designed to enhance network reliability through a ‘quality-first’ approach. This also leverages EDA for flexibility and scalability.

“With ‘human error zero’ as the goal, our EDA customers start with defining what a data centre looks like, and then EDA translates that intent into underlying configuration,” concludes Michael Bushong, Vice President of Data Center at Nokia. “As it turns out, when you deliver reliable operations, the natural by-products are speed and efficiency.”

Sponsored by Nokia.
