HPC

POD giveth: Is factory-built HPC the future?

Mini data centers getting big love


Hewlett-Packard is looking to steal a march on everyone in the industry with its POD-Works factory approach to building new data centers or adding capacity to old ones.

POD is HP’s name for the now-ubiquitous shipping-container-style mini data center that started catching on (at least with vendors) a few years ago. HP is taking the concept a step further by applying large-scale industrialization to the process of producing PODs in volume.

Our pal TPM provides further detail on PODs here.

HP’s POD-Works is co-located in Houston with one of its major manufacturing and distribution centers, which gives it quick access to loads of hardware, since this is a primary receiving center for HP x86 gear. The company has put up a purpose-built facility there for building out, testing, and qualifying PODs before they’re shipped to the customer. I think this has the potential to change the economics of buying at scale. Why?

The process of delivering, installing, and certifying a supercomputer is a lot like building a house. Materials in the form of racks, boards, processors, memory, cables, drives, cords, and everything else are delivered to the empty data center just like concrete, wood, sheetrock, and other supplies would be delivered to an empty lot. Then a group of skilled or semi-skilled workers start the building process from the ground up.

It’s pretty labor-intensive. Just testing all of the parts can consume quite a bit of time. If you’re building a modest system with, say, 2,000 nodes, you might be looking at 4,000 processors and 32,000 individual DIMMs. A system this size might put you in the middle of the pack on the Top500 list.

When the numbers get that big (and a 2,000-node box isn’t all that big these days) you’re going to see problems simply due to the scale of the system. You’ll see memory speed mismatches, or the wrong model processors in the shipment, or some components that are DOA. Even if the vendor ships the right working parts 99.9 per cent of the time, that still means you’ll see around 32 bad DIMMs and three or four bad processors.
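To put rough numbers on that, here’s a quick back-of-the-envelope sketch in Python. The per-node counts (two sockets and 16 DIMMs per node, matching the 4,000-processor and 32,000-DIMM figures above) and the 99.9 per cent good-part rate are the assumptions from the paragraphs above; the rest is straight multiplication.

```python
# Back-of-the-envelope component counts and expected duds for a
# hypothetical 2,000-node cluster (assumptions as in the text above).
nodes = 2_000
cpus_per_node = 2        # dual-socket x86 nodes (assumption)
dimms_per_node = 16      # eight DIMMs per socket (assumption)
good_part_rate = 0.999   # a correct, working part arrives 99.9% of the time

cpus = nodes * cpus_per_node      # 4,000 processors
dimms = nodes * dimms_per_node    # 32,000 DIMMs

print(f"Expect roughly {cpus * (1 - good_part_rate):.0f} bad processors "
      f"and {dimms * (1 - good_part_rate):.0f} bad DIMMs on delivery")
```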

If you’re the one building this system (either as the end customer or an integrator/vendor), you have to inspect all the stuff to make sure it’s right and then test it to make sure it works. You’ll have to ship the wrong/defective/broken parts back and get new stuff shipped out to you. Delays ensue, voices are raised and, often, tears follow.

Contrast this to building the same system in a dedicated factory right next door to a massive system distribution facility. Received the wrong part, or got a busted one? Walk next door and get a new one. Components not working as they should? The folks at the factory have built lots of these things in every imaginable configuration and have seen it all – they’ll get it working. Testing and certification can also take place at the factory, with customer representatives flown in to participate in the process.

HP estimates that the POD factory will result in an 8x speed-up in system deployment. It also says a POD is 37 per cent more efficient and 45 per cent less expensive than a typical data center. I’m not sure exactly how that’s measured, but it’s obvious to me that there would be significant savings from this model.

Of course, there are some trade-offs inherent in this too. The first is that you’re going to get your systems in POD-sized chunks – up to 1,100U at a time. Rather than row after row of servers and storage in a single large room, you’d have POD after POD arrayed in what could be just a shell of a warehouse providing physical security and protection from the elements but little else. (Well, a restroom and drinking fountain wouldn’t be too much to ask.)
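For a rough sense of what buying in 1,100U chunks means, here’s a small sizing sketch. It assumes plain 1U compute nodes and ignores the rack space that switches, storage, and power gear would eat in a real build, so treat it as illustration rather than a deployment plan.

```python
import math

POD_CAPACITY_U = 1_100   # usable rack units per POD, per the figure above
NODE_HEIGHT_U = 1        # assumption: 1U compute nodes, nothing else in the racks

def pods_needed(node_count: int) -> int:
    """Minimum number of PODs to house the given number of nodes."""
    return math.ceil(node_count * NODE_HEIGHT_U / POD_CAPACITY_U)

# The 2,000-node example from earlier would land in two PODs.
print(pods_needed(2_000))   # -> 2
```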

In a lot of ways, it really comes down to aesthetic and logistic considerations. A house built in a factory with standardized parts can definitely be better from a quality standpoint and be less expensive to build and own. But the trade-off comes from the constraints arising from that model – if you want something highly customized, you’re out of luck.

I would also assume that with HP PODs, if you want them populated with, say, Dell servers, you are similarly out of luck. However, in HPC systems, the vast majority of the gear is already built from standard x86 system building blocks that are easily configurable to provide for variations in needs or budgets.

I think we may be looking at the beginnings of a significant trend here – an evolution in the HPC buying model at least, assuming that customers are willing to put speedy deployment and bang for the buck ahead of customization, logistics, and looks. If the savings and speed pan out, I expect that we’ll see other vendors respond by putting together factories of their own. HP said that it’ll have a POD at SC10. I’m going to take a closer look at it, video camera in hand, and drill down into the economics of it a bit more too.

If you’re interested in seeing more, HP has some short videos up showing how a POD is constructed (“Birth of an HP POD”), a look at the tech inside a pod (“Engineer tour of the HP POD”), plus there are plenty of others that you can easily find on its site. ®
