Sponsored If you take your data centre infrastructure seriously, you’ll have taken pains to construct a balanced architecture of compute, memory and storage precisely tuned to the needs of your most important applications.
You’ll have balanced the processing power per core with the appropriate amount of memory, and ensured that both are fully utilised by doing all you can to can get data off your storage subsystems and to the CPU as quickly as possible.
Of course, you’ll have made compromises. Although the proliferation of cores in today’s processors puts an absurd amount of compute power at your disposal, DRAM is expensive, and can only scale so far. Likewise, in recent years you’ll have juiced up your storage with SSDs, possibly going all flash, but there are always going to be bottlenecks en route to those hungry processors. You might have stretched to some NVMe SSDs to get data into compute quicker, but even when we’re pushing against the laws of physics, we are still constrained by the laws of budgets. This is how it’s been for over half a century.
So, if someone told you that there was a technology that could offer the benefits of DRAM, but with persistence, and which was also cheaper than current options, your first response might be a quizzical, even sceptical, “really”. Then you might lean in, and ask “really?”
That is the promise of Intel® Optane™, which can act as memory or as storage, potentially offering massive price performance boosts on both scores. And drastically improve the utilisation of those screamingly fast, and expensive, CPUs.
So, what is Optane™? And where does it fit into your corporate architecture?
Intel describes Optane™ as persistent memory, offering non-volatile high capacity with low latency at near DRAM performance. It’s based on the 3D XPoint™ technology developed by Intel and Micron Technology. It is byte and bit addressable, like DRAM. At the same time, it offers a non-volatile storage medium without the latency and endurance issues associated with regular flash. So, the same media is available in both SSDs, for use as storage on the NVMe bus, and as DIMMs for use as memory, with up to 512GB per module, double that of current conventional memory.
It’s also important to understand what Intel means when it talks about the Optane™ Technology platform. This encompasses both forms of Optane™ – memory and storage - together with the Intel® advanced memory controller and interface hardware and software IP. This opens up the possibility not just of speeding up hardware operations, but of optimising your software to make the most efficient use of the hardware benefits.
So where will Optane™ help you? Let’s assume that the raw compute issue is covered, given that today’s data centre is running CPUs with multiple cores. The problem is more about ensuring those cores are fully utilised. Invariably they are not, simply because the system cannot get data to them fast enough.
DRAM has not advanced at the same rate as processor technology, as Alex Segeda, Intel’s EMEA business development manager for memory and storage, explains, both in terms of capacity growth and in providing persistency. The semiconductor industry has pretty much exhausted every avenue available when it comes to improving price per GB. When it comes to the massive memory pools needed in powerful systems, he explains, “It’s pretty obvious that DRAM becomes the biggest contributor to the cost of the hardware…in the average server it’s already the biggest single component.”
Meanwhile, flash - specifically NAND - has become the default storage technology in enterprise servers, and manufacturers have tried everything they can to make it cheaper, denser and more affordable. Segeda compares today’s SSDs to tower blocks - great for storing something, whether data or people, but problems arise when you need to get a lot of whatever you’re storing in or out at the same time. While the cost of flash has gone down, endurance and performance, especially on write operations, means “it's not fit for the purpose of solving the challenge of having a very fast, persistent storage layer”.
Moreover, Segeda maintains, many people are not actually aware of these issues. “They’re buying SSDs, often SAS SSDs, and they think it is fast enough. It's not. You are most likely not utilising your hardware to the full potential. You paid a few thousand dollars for your compute, and you’re just not feeding it with data.”
To highlight where those chokepoints are in typical enterprise workloads, Intel has produced a number of worked examples. For example, when a 375GB Optane™ SSD DC P4800X is substituted for a 2TB Intel® SSD DC P4500 as the storage tier for a MySQL installation running 80 virtual cores, CPU utilisation jumps from 20 per cent to 70 per cent, while transaction throughput per second is tripled, and latency drops from over 120ms to around 20ms.
This latency reduction, says Segeda, “is what matters if you’re doing things like ecommerce, high frequency trading.”
The same happens when running virtual machines, using Optane™ in the caching tier for the disk groups in a VMware vSAN cluster, says Segeda. “We're getting half of the latency and we're getting double the IO from storage. It means I can have more virtual machines accessing my storage at the same time. Right on the same hardware. Or maybe I can have less nodes in my cluster, just to deliver the same performance.”
A third example uses Intel® Optane™ DC Persistent memory as a system memory extension in a Redis installation. The demo compares a machine with total available memory of 1.5TB of DRAM and a machine using 192GB of DRAM and 1.5TB of DCPMM. The latter delivered the same degree of CPU utilization, with up to 90 per cent of the throughput efficiency of the DRAM only server.
These improvements hold out the prospect of cramming more virtual machines or containers on the same server, says Segeda, or keeping more data closer to the PC, to allow real time analytics. This is important because while modern applications generate more and more data, only a “small, small fraction” is currently meaningfully analysed, says Segeda. “If you're not able to do that, and get that insight, what's the point of capturing the data? For compliance?” Clearly, compliance is important but it doesn’t help companies monetise the data they’re generating or giving them an edge over rivals.
The prospect of opening up storage and memory bottlenecks will obviously appeal, whether your infrastructure is already straining, or because while things are ticking over right this minute, you know that memory and storage demands are only likely to go in one direction in future. So, how do you work out how and where Optane™ will deliver the most real benefit for your own infrastructure?
On a practical level, the first step is to identify where the problems are. Depending on your team’s engineering expertise, this could be something you can do inhouse, using your existing monitoring tools. Intel® also provides a utility called Storage Performance Snapshot to run traces on your infrastructure and visualise the data to highlight where data flow is being choked off.
Either way, you’ll want to ask yourself some fundamental questions, says Segeda: “What's your network bandwidth? Is it holding you back? What's your storage workload? What's your CPU utilisation? Is the CPU waiting for storage? Is the CPU waiting for network? [Then] you can start making very meaningful assumptions.” This should give you an indication of whether expanding the memory pool, or accelerating your storage, or both will help.
As for practical next steps, Segeda suggests talking through options with your hardware suppliers, and Intel account manager if you have one, to take a holistic view of the problem.
Simply retrofitting your existing systems can be an option he says. Add in an Optane™ SSD on NVMe, and you have a very fast storage device. Optane™ memory can be added to the general memory pool, giving memory expansion at a relatively lower cost.
However, Segeda says, “You can have a better outcome if you do some reengineering, and explicit optimization.”
Using Optane™ as persistent memory requires significant modification to the memory controller, something that is currently offered in the Intel® Second Generation Xeon® Scalable Gold or Platinum Processors. This will enable the use of App Direct Mode, which allows suitably modified applications to be aware of memory persistence. So, for example Segeda explains, this will allow an in memory database like SAP Hana to exploit the persistence, meaning it does not have to constantly reload data.
Clearly, an all-new installation raises the option of a more efficient setup, with software optimised to take full advantage of the infrastructure, and with fewer but more compute powerful nodes. All of which gives which the potential to save not just on DRAM and storage, but on electricity, real estate, and also on software licenses.
For years, infrastructure and software engineers and data centre architects have had to delicately balance computer, storage, memory, and network. With vast pools of persistent memory and faster storage now in reach, at lower cost, that juggling act may just be about to get much, much easier.
Sponsored by Intel®