How Apple's M1 uses high-bandwidth memory to run like the clappers

Expandability traded for performance


Apple last week set the cat among Intel's pigeons with the launch of its first PCs incorporating silicon designed in-house.

The company claims its M1 Arm chip delivers up to 3.5x faster CPU performance, up to 6x faster GPU performance, up to 15x faster machine learning, and up to 2x longer battery life than previous-generation Macs, which use Intel x86 CPUs.

Let's take a closer look at how Apple uses high-bandwidth memory in the M1 system-on-chip (SoC) to deliver this rocket boost.

High-bandwidth memory (HBM) does away with the traditional CPU-socket-and-memory-channel design by pooling memory and connecting it to a processor through an interposer layer. HBM combines memory dies into stacks and gives them closer, faster access to the CPU, as the distance to the processor is just a few micrometres. That proximity alone speeds data transfers.
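To put illustrative numbers on that – these are generic HBM2 and DDR4 figures, not Apple's – a single HBM2 stack is 1,024 bits wide at roughly 2 GT/s per pin, while a DDR4-3200 DIMM channel is just 64 bits wide. A back-of-the-envelope sketch of the peak-bandwidth arithmetic:

```swift
// Back-of-the-envelope peak-bandwidth arithmetic with illustrative,
// generic figures (not Apple's): HBM is fast because it is very wide,
// not because each pin is especially quick.

// Peak bandwidth in GB/s = (bus width in bits / 8) bytes * GT/s
func peakBandwidthGBps(busWidthBits: Double, gtPerSecond: Double) -> Double {
    (busWidthBits / 8.0) * gtPerSecond
}

// One HBM2 stack: 1,024-bit interface at ~2 GT/s per pin.
let hbm2Stack = peakBandwidthGBps(busWidthBits: 1024, gtPerSecond: 2.0)

// One DDR4-3200 DIMM channel: 64-bit interface at 3.2 GT/s.
let ddr4Channel = peakBandwidthGBps(busWidthBits: 64, gtPerSecond: 3.2)

print("HBM2 stack:   \(hbm2Stack) GB/s")   // 256.0 GB/s
print("DDR4 channel: \(ddr4Channel) GB/s") // 25.6 GB/s
```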

The M1, Apple's first Mac SoC, is built by chip foundry TSMC on its 5nm process and packs 16 billion transistors. It includes an eight-core CPU, an eight-core GPU, a 16-core Neural Engine, a storage controller, an image signal processor, and media encode/decode engines.

This Apple diagram of the M1 SoC shows two blocks of DRAM:

Apple M1 unified memory architecture

The SoC has access to up to 16GB of unified memory. This uses 4266 MT/s LPDDR4X SDRAM (synchronous DRAM) and is mounted with the SoC using a system-in-package (SiP) design. A SoC is built from a single semiconductor die whereas a SiP connects two or more semiconductor dies.
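The same arithmetic applied to the M1's memory gives a peak figure in the high 60s of GB/s – assuming, as widely reported but not stated by Apple, a 128-bit LPDDR4X interface:

```swift
// Peak bandwidth of the M1's unified memory, assuming (as widely
// reported, though not confirmed by Apple) a 128-bit LPDDR4X interface.
let busWidthBits = 128.0
let megatransfersPerSecond = 4266.0

// (bits / 8) bytes per transfer, times MT/s, scaled to GB/s.
let peakGBps = (busWidthBits / 8.0) * megatransfersPerSecond / 1000.0
print("Peak unified-memory bandwidth: \(peakGBps) GB/s") // ~68.3 GB/s
```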

SDRAM operations are synchronised to the SoC processing clock speed. Apple describes the SDRAM as a single pool of high-bandwidth, low-latency memory, allowing apps to share data between the CPU, GPU, and Neural Engine efficiently.

In other words, this memory is shared between the three different compute engines and their cores. None of them has its own separate memory that data would first have to be copied into – as would happen when, say, an app running on the CPU needs graphics processing and the GPU swings into action on data held in its own dedicated memory.
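For a concrete, if hypothetical, illustration of what that sharing looks like to a developer: in Metal, a buffer allocated with the shared storage mode sits in the one unified pool, so the CPU can fill it and the GPU can read it without a staging copy. A minimal sketch:

```swift
import Metal

// A minimal sketch: on Apple silicon a .storageModeShared buffer lives
// in the single unified pool, visible to the CPU and GPU alike.
guard let device = MTLCreateSystemDefaultDevice() else {
    fatalError("No Metal-capable device found")
}

// CPU-side write lands directly in unified memory.
var samples: [Float] = (0..<1024).map { Float($0) }
let buffer = device.makeBuffer(bytes: &samples,
                               length: samples.count * MemoryLayout<Float>.stride,
                               options: .storageModeShared)!

// A GPU compute pass can bind `buffer` as-is – no blit encoder, no
// second .storageModePrivate copy, as a discrete-GPU design would need:
// computeEncoder.setBuffer(buffer, offset: 0, index: 0)
```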

The downside of this design is that expandability is traded for performance. Users cannot simply add more memory to the configuration: there are no DIMM slots or carriers to plug into, because DIMM technology isn't used at all.

We can envisage a future in which storage controllers, SmartNICs, and DPUs all use Arm SoCs with a pool of unified memory, running their workloads much faster than traditional x86 controllers hampered by memory sockets and DIMMs.

For instance, Nebulon's Storage Processing Unit (SPU) uses dual Arm processors. Conceivably this could move to a unified memory design, giving Nebulon additional power to run its storage processing workload, and so exceed x86-powered storage controllers in performance, cost, and efficiency terms even more than it does now. ®
