Wow, still using disk and PCIe storage? You look like a flash-on victim, darling – it isn't 2014

Server slingers, like models, it's OK to be DIMM


Comment For generations of PowerEdge, ProLiant, UCS and other x86 servers, the future has been fairly simple: more powerful multi-core processors, more memory, more PCIe bandwidth, and shrinking space and electricity needs.

For example, a Gen8 ProLiant DL360e server had one or two Xeon E5-2400/E5-2400 v2 processors, with 2/4/6/8/10 cores, and 12 x DDR3 DIMMs at up to 1,600MHz (384GB max). The replacement Gen9 ProLiant DL160 server uses one or two Xeon E5-2600 v3 series processors, with 4/6/8/10/12 cores, and is fitted with 16 x DDR4 DIMMs at up to 2,133MHz (512GB max).

The hardware got better but the application and operating software didn’t have to change much at all, except down at the detailed driver/hardware interface level, as Gen8 ProLiants gave way to Gen9 ones. Application code read data from persistent storage into memory, got the CPU cores to chew on it, and wrote the results back to persistent storage. Rinse, repeat, job done.
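That read-compute-write cycle can be sketched in a few lines. This is an illustrative toy (the file paths and the `process` function are invented stand-ins), but it shows the shape of the pattern the article says is about to change: every trip through `open`/`read`/`write` goes via the OS IO stack.

```python
import os

def process(buf: bytes) -> bytes:
    # Stand-in for the CPU cores "chewing" on the data held in DRAM.
    return buf.upper()

def run_job(src_path: str, dst_path: str) -> None:
    with open(src_path, "rb") as src:   # IO: persistent storage -> DRAM
        data = src.read()
    result = process(data)              # compute on the in-memory copy
    with open(dst_path, "wb") as dst:   # IO: DRAM -> persistent storage
        dst.write(result)
        os.fsync(dst.fileno())          # ensure the result is persistent
```

Every one of those IO calls pays the storage-stack latencies discussed below.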

That is about to change.

The massive collective power of multi-core CPUs and virtualised server software means that apps are spending proportionately more time waiting for IO from persistent storage. There is also an increasing need for servers to work faster on larger chunks of data, with a desire to avoid the latency-intensive IOs to persistent storage.

The finger of latency blame is pointing at three places, and declaring:

  1. Disk is too slow for random data IO.
  2. Flash, though faster than disk, is still too slow for IO.
  3. The disk-based IO stack in an OS takes too much time and is redundant.

You can bring storage media “closer” to the server’s DRAM and CPU cores: directly attach it, move from disk to flash, and then move from the disk-era SATA and SAS protocols to PCIe with NVMe drivers. That improves matters, but not enough.

It still takes too much time to read data from PCIe flash into DRAM, and the data needs to be in DRAM or some other memory medium for the CPU cores to get hold of it fast.

The answer the industry seems to be agreeing on starts from the observation that DRAM, although fast, is still too expensive to use at the multi-TB-per-server level. So put solid state storage on the memory channel in DIMM form and, although it is non-volatile, treat it as memory. Data is moved from a NAND DIMM, for example, to a DRAM DIMM using memory load and store instructions, not traditional, slow IO commands running through the operating system stack.
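The load/store idea can be approximated on an ordinary Linux box today with a memory mapping. This is a sketch under stated assumptions: the file path is hypothetical and a plain temp file stands in for what would, on real hardware, be a file on a DAX-mounted persistent-memory filesystem backed by the NVDIMM. The point is the access style: once mapped, bytes are touched with CPU loads and stores, not `read()`/`write()` syscalls.

```python
import mmap
import os
import tempfile

def open_pmem(path: str, size: int) -> mmap.mmap:
    # Map a file as a byte-addressable region. On a real system this
    # would be a file on a DAX-mounted pmem filesystem.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    os.ftruncate(fd, size)
    mm = mmap.mmap(fd, size)
    os.close(fd)  # the mapping keeps its own reference to the file
    return mm

# Hypothetical stand-in for an NVDIMM-backed region
path = os.path.join(tempfile.mkdtemp(), "fake_pmem.bin")
mm = open_pmem(path, 4096)

mm[128:133] = b"hello"      # a store: no syscall, no block IO stack
word = bytes(mm[128:133])   # a load: byte-granular, not block-granular
mm.flush()                  # push dirty data to the backing medium
```

Note that the working granularity is a byte range chosen by the program, not a 512-byte or 4KB block dictated by the storage protocol.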

A SAS MLC SSD read takes roughly 150 microsecs, and an NVMe SSD read can take 120 microsecs. An NVDIMM-F read can take 5-10 microseconds, some 12-24 times faster than even the NVMe drive. Here is a chart of NVDIMM types:

Xitore NVDIMM chart

The Memory1 and NVDIMM-X (Xitore's in-development NVDIMM technology) entries in the chart are not germane to the argument we’re pursuing here, so just ignore them.

Say an NVDIMM-F access takes 10 microsecs; that’s 10,000 nanosecs, while a DDR4 DRAM access can take 14ns, more than 700 times faster, and even that looks slow to a CPU whose Level 1 cache access takes 0.5ns. For comparison, a PCIe SSD access costs 30 microsecs for a write and 110 microsecs for a read, going by Micron 9100 NVMe PCIe SSD figures. This means the Micron NVMe SSD takes 11 times longer than the NVDIMM-F to access data. (These figures are illustrative and specific product mileage may vary, as they say.)

Now bring post-NAND media into the equation, such as Intel and Micron’s 3D XPoint. It has a 7-microsec read latency, almost 16 times faster than the Micron NVMe SSD. And that’s in its initial v1.0 form.
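The ratios above follow directly from the article's own figures; re-deriving them keeps the comparisons honest:

```python
# All figures from the article, converted to nanoseconds.
DDR4_DRAM   = 14        # DDR4 DRAM access
NVDIMM_F    = 10_000    # NVDIMM-F read, upper end of the 5-10 us range
NVME_READ   = 110_000   # Micron 9100 NVMe PCIe SSD read
XPOINT_READ = 7_000     # 3D XPoint v1.0 read

dram_vs_nvdimm = NVDIMM_F / DDR4_DRAM     # ~714x: "more than 700 times faster"
nvme_vs_nvdimm = NVME_READ / NVDIMM_F     # 11x longer on the NVMe SSD
nvme_vs_xpoint = NVME_READ / XPOINT_READ  # ~15.7x: "almost 16 times faster"
```
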

These numbers are encouraging the non-volatile media and drive vendors to evangelise the idea of NAND and XPoint DIMMs (and other media such as ReRAM) to the server vendors. Your servers, they say, will be able to support more VMs and have those VMs run faster, because IO waits can be side-stepped by treating fast storage media as memory.

System and application software will need to change too, so that it stops issuing time-consuming IO commands and uses memory load-store operations instead, since NVDIMM non-volatile media can be addressed as memory, with byte-level rather than block-level addressing.
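The byte-versus-block distinction is what makes the software change worthwhile. A toy comparison (sector size and helper names are invented for illustration): on block storage, changing a single byte means a read-modify-write of a whole sector through the IO stack; on a byte-addressable, memory-mapped region it is one store.

```python
import mmap
import os

SECTOR = 512  # typical block-device granularity

def update_byte_block_style(path: str, offset: int, value: int) -> None:
    # Block addressing: read-modify-write an entire sector to change 1 byte.
    base = (offset // SECTOR) * SECTOR
    with open(path, "r+b") as f:
        f.seek(base)
        sector = bytearray(f.read(SECTOR))
        sector[offset - base] = value
        f.seek(base)
        f.write(sector)  # 512 bytes moved through the IO stack for one byte

def update_byte_memory_style(mm: mmap.mmap, offset: int, value: int) -> None:
    # Byte addressing: a single store into the mapped region.
    mm[offset] = value
```

The first helper pays syscall and block-transfer costs on every update; the second touches exactly the bytes that changed, which is the access model NVDIMM-aware software is being asked to adopt.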

Let’s imagine a Gen10 ProLiant server using such NVDIMMs. It can’t do so until operating systems support NVDIMM-style load-store IO, and until key system and application software vendors, such as web browser, database, mail and collaboration software suppliers, support this new memory-style IO as well, or are convincingly set on doing so.

HPE, Cisco, Dell and other server vendors would have to work out what the ratio of NVDIMM to DRAM capacity should be. They would also need to calculate the ratio of that NVDIMM capacity to PCIe flash and SAS/SATA disk capacity, and all this means running test workloads through prototype systems, analysing the results, and optimising system components to balance performance, power draw, temperature and cost.
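A crude way to see what that balancing act involves: weight each tier's latency by the fraction of accesses it services. This is a toy model; the hit fractions below are invented for illustration, and only the latency figures come from the article. Real sizing would use measured workload traces, plus cost, power and thermal inputs.

```python
def avg_latency_ns(hit_fractions: list, latencies_ns: list) -> float:
    # Weighted average access time across the tiers servicing requests.
    assert abs(sum(hit_fractions) - 1.0) < 1e-9, "fractions must sum to 1"
    return sum(f * l for f, l in zip(hit_fractions, latencies_ns))

tiers   = ["DRAM", "NVDIMM-F", "NVMe SSD"]
latency = [14, 10_000, 110_000]   # ns, figures from the article

mix_a = [0.90, 0.08, 0.02]  # DRAM-heavy: fast but expensive per TB
mix_b = [0.50, 0.45, 0.05]  # NVDIMM-heavy: cheaper per TB, slower on average

a = avg_latency_ns(mix_a, latency)  # ~3,013 ns average access
b = avg_latency_ns(mix_b, latency)  # ~10,007 ns average access
```

Even this crude model shows why the ratios matter: shifting capacity between tiers moves average access latency by integer factors, which is exactly the trade-off the vendors would be prototyping.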

Developing the next generation of servers is going to be much more complex than developing the current one, as all this NVDIMM-related hardware and software complexity is added on top of the expected and generally well-understood work on CPUs, DRAM, IO adapters and so on. We can expect Intel to throw FPGAs from its Altera business into the mix as well, to speed specific application workloads.

Server vendors have a hard development job to do, but if they get it right, much more powerful servers should result, and we’ll all like that. ®
