Sponsored
Artificial intelligence and machine learning hold out the promise of enabling businesses to work smarter and faster, by improving and streamlining operations or offering firms the chance to gain a competitive advantage over their rivals. But where is best to host such applications – in the cloud, or locally, at the edge?
Despite all the hype, it is early days for the technologies that we loosely label “AI”, and many organisations lack the expertise and resources to really take advantage of it. Machine learning and deep learning often require teams of experts, for example, as well as access to large data sets for training, and specialised infrastructure with a considerable amount of processing power. For this reason, many organisations experimenting with machine learning or deep learning often turn to the cloud for their development work.
This is because cloud service providers have a wealth of development tools and other resources readily available, such as pre-trained deep neural networks for voice, text, image, and translation processing, according to Moor Insights & Strategy Senior Analyst Karl Freund.
Intel®, for example, supports a number of popular AI frameworks, such as PyTorch and TensorFlow, available on the Intel® DevCloud, allowing developers to test and run workloads on a cluster of the latest Intel® hardware and software.
However, as Freund goes on to point out, while cloud services bring much of the click-and-go simplicity to AI development that customers have come to associate with cloud services, there will sometimes be a catch in that those applications may only be able to run on the cloud platform on which they were developed.
There is another downside to hosting machine learning and other data-intensive applications in the cloud, and that is latency. Although the major cloud providers each have a network of data centres located strategically around the globe, the nearest one to your location might still be hundreds or thousands of miles away, guaranteeing significant latency in data traffic. And latency is a real killer when it comes to applications that may require a real-time response. It isn’t just latency, however – depending upon how much data your application uses, there could be significant bandwidth costs in uploading it all across a wide area network link to a public cloud host or back to a massive centralised data centre.
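To put rough numbers on the latency and bandwidth point, the following is a back-of-the-envelope sketch rather than a measurement. It assumes signals travel through optical fibre at roughly 200,000 km/s (about two-thirds of the speed of light), and the distance, data volume and link speed are made-up illustrative values. Real networks add routing, queuing and protocol overhead on top, so these are best-case floors.

```python
# Best-case estimates only: real links add routing, queuing and protocol overhead.
FIBRE_SPEED_KM_PER_S = 200_000  # assumed: approx. signal speed in optical fibre


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on round-trip propagation delay over fibre, in milliseconds."""
    return 2 * distance_km * 1000 / FIBRE_SPEED_KM_PER_S


def upload_hours(data_gb: float, link_mbit_per_s: float) -> float:
    """Time to send data_gb gigabytes over a link of the given speed, in hours."""
    data_megabits = data_gb * 8 * 1000  # decimal units: 1 GB = 8,000 megabits
    return data_megabits / link_mbit_per_s / 3600


if __name__ == "__main__":
    # Hypothetical site 1,500 km from the nearest cloud region.
    print(f"Round trip floor: {min_round_trip_ms(1500):.1f} ms")
    # Hypothetical 500 GB of daily sensor data over a 100 Mbit/s WAN link.
    print(f"Upload time: {upload_hours(500, 100):.1f} hours")
```

Even the propagation floor alone eats into the budget of an application that has to respond within a few tens of milliseconds, and the upload estimate shows how quickly a modest WAN link becomes the bottleneck for data-heavy workloads.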
Issues such as these are one reason for the rise of edge computing, which is easy to dismiss as just another IT industry buzzword but is actually grounded in the common sense notion that it is sometimes more desirable to perform the processing close to the point of action, where the data is being generated.
Take the operation of a facial recognition system at an airport, for example. You would want to do all the processing locally, to keep response times low, while most of the data is not needed afterwards, except perhaps for sending an activity log or telemetry back to the main data centre. This also highlights another issue, that of reliability. Using local compute for processing reduces the potential for unanticipated downtime due to network connection issues with the backhaul to the cloud or the main data centre.
Edge computing is a somewhat loose term, however, and can cover a wide range of deployment scenarios and actual hardware, from small embedded devices, perhaps based on an Intel Atom® processor, to racks full of Intel® Xeon® servers and other equipment that would not look out of place in a full-scale data centre. The latest generation of Xeon® Scalable processors, meanwhile, has Intel® Deep Learning Boost enhancements to increase AI/deep learning performance.
A good example of where edge computing fits in is in the infrastructure required to support 5G networks. The challenging requirements for next-generation mobile networks, including data rates of gigabits per second, ultra-low latency, a high level of reliability and support for a large number of simultaneously connected devices, mean that cell base stations are going to need a considerable amount of compute power in order to meet service demands.
In fact, meeting all those demands calls for extensive use of technologies such as software defined networking (SDN) and network function virtualisation (NFV), meaning that with 5G, cell base stations are effectively turning into mini data centres.
SDN, NFV and 5G are fertile ground for the Second Generation Intel® Xeon® Scalable processors, which already have the compute power and hardware features for operating multiple virtual appliances such as switching and routing functions, but now also have further optimisations. These include Intel® AVX-512 extensions that can accelerate physical layer signal processing in 5G networks, and Intel® QuickAssist Technology to boost security and data compression functions.
According to Intel®, the new second-generation chips deliver up to 1.58 times the performance of the previous generation for network workloads, allowing them to handle a greater density of virtual network functions (VNFs), while Intel®’s open-source Data Plane Development Kit (DPDK) allows customers to optimise performance of SDN workloads such as Vector Packet Processing (VPP) on the same chips.
Factories are another good example of edge computing, as businesses move towards industrial digitalisation. Also somewhat grandly known as Industry 4.0, this is considered to be where analytics, AI and the industrial internet of things (IoT) all converge to drive better decision-making and boost productivity. For example, computer vision using Intel® Movidius® VPUs (vision processing units) could be used to monitor production lines, analysing images of products to automatically detect any with potential defects.
Consequently, some industrial systems that have been developed in response by vendors including HPE, Dell EMC and Lenovo are basically micro data centres, fitting one or more racks full of servers, networking, storage and the required power and cooling infrastructure into a single enclosure that can be deployed somewhere on the factory site.
But one of the challenges of this model is that those edge data centres are going to be widely distributed geographically, and some may even be sited in locations where it is difficult or undesirable to keep sending out engineers to make changes or fix things. What this implies is a need for a high level of automation, so that such systems can run themselves with as little human intervention as possible, and even if there is a need for human intervention, the majority of management tasks should be capable of being performed remotely.
One answer to this is to use hyperconverged infrastructure (HCI), which is already widely deployed because it offers a pre-integrated solution that is less complex to configure and manage. The basic premise of HCI is that each node is an appliance-like box that contains compute and storage, where the storage is pooled across a cluster of nodes to provide a software-defined storage layer that eliminates the need for separate storage in a SAN. The management layer that sits atop many HCI platforms allows an admin to oversee and control systems from a single console, regardless of location. HCI is often used for remote office or branch office deployments for exactly this reason – the hardware can be configured and dispatched to the remote site, where it is simply plugged in and turned on.
Most HCI platforms also have a measure of redundancy for greater reliability, in that a deployment usually comprises multiple nodes, and the workload can be redistributed among the existing nodes in the event of a hardware failure.
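The redistribution idea can be illustrated with a toy sketch. This is not how any particular HCI product schedules workloads; the node names, the workload names and the "fewest workloads wins" rule are all assumptions chosen for simplicity, and real platforms also weigh CPU, memory and storage locality.

```python
def redistribute(placement: dict[str, list[str]], failed: str) -> dict[str, list[str]]:
    """Return a new placement with the failed node's workloads moved to survivors."""
    # Copy the surviving nodes so the caller's placement is left untouched.
    survivors = {node: list(wl) for node, wl in placement.items() if node != failed}
    if not survivors:
        raise RuntimeError("no surviving nodes to take over the workloads")
    for workload in placement.get(failed, []):
        # Assumed policy: hand each workload to the node running the fewest.
        target = min(survivors, key=lambda node: len(survivors[node]))
        survivors[target].append(workload)
    return survivors


# Hypothetical three-node cluster; "node-a" suffers a hardware failure.
cluster = {
    "node-a": ["vm1", "vm2"],
    "node-b": ["vm3"],
    "node-c": ["vm4", "vm5"],
}
after = redistribute(cluster, "node-a")

if __name__ == "__main__":
    for node, workloads in after.items():
        print(node, workloads)
```

The point of the sketch is that no workload is lost when a node drops out, provided the remaining nodes have spare capacity, which is why HCI clusters are typically sized with headroom.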
There are other considerations for infrastructure to support data-intensive workloads, of course. For one thing, accelerators such as Intel® FPGAs or more exotic hardware such as the Intel® Nervana™ Neural Network Processor for Inference (Intel® Nervana™ NNP-I) may be required to deliver the desired level of performance for AI and ML workloads.
Keeping the accelerators fed with data is also a key requirement, which means having a high performance storage layer. This is best delivered through solid-state storage, almost certainly NVMe drives such as Intel® Optane™ DC SSDs these days, as these provide higher I/O performance and lower latency than flash SSDs using a traditional SAS or SATA interface. However, network connectivity, servers and storage are all equally vital in delivering the required level of system performance when considering AI and ML workloads, as a lag in any single one of these aspects of the infrastructure will hinder performance.
There are some potential drawbacks to edge computing. Implementing an edge strategy could be expensive and complex, at least in the early stages, because a significant upfront investment in equipment, resources and planning may be required.
Ensuring security may be another challenge with edge infrastructure. With hardware being deployed into remote or unsupervised locations, physical security may be more of a concern than with a large central data centre. For this reason, data encryption and secure access take on an extra importance in protecting edge computing systems.
As with any IT project, there is little sense investing in AI at the edge without having a specific business purpose in mind, and an assessment of the costs and benefits of choosing edge over other deployment models is a necessary first stage.
Fortunately, there is the Intel® AI Developer Program offering resources to help with the creation of AI projects from the data centre to the edge.
Finally, there has been much talk of edge computing displacing or replacing cloud, as if somehow enterprise customers are going to repatriate all of the workloads they have farmed out to cloud providers such as AWS and Azure and redeploy them to edge data centres.
Instead, edge computing is just another way of delivering IT services, in the same way that cloud is, and it isn’t going to replace cloud any more than cloud has replaced on-premises IT. Expect edge computing to become just another part of the IT environment, used where an edge deployment model best fits the requirements of the application.
Sponsored by Intel®