Off-Prem

Apple built custom servers and OS for its AI cloud

Mashup of iOS and macOS runs on homebrew silicon, with precious little for sysadmins to scry


WWDC Apple has revealed it created its own datacenter stack – servers built on its in-house silicon and running a custom operating system – at its Worldwide Developer Conference (WWDC) on Monday.

Cupertino hasn't actually announced the servers or the OS (and has never addressed rumors that it planned to make datacenter-grade processors). Instead, references to the chips and OS are scattered across the blizzard of announcements about AI features and product updates.

Those AI features rely on what Apple's called "Private Cloud Compute" – an off-device environment where the iGiant runs "larger, server-based models" that do AI better than the models Cupertino loads onto its iThings.
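Apple hasn't published an API for this hand-off, but the shape of it is a capability check with a fallback: serve the request from the on-device model if it can cope, otherwise route it to Private Cloud Compute. Here's a minimal Swift sketch of that routing idea – every type, function, and threshold below is hypothetical, not a real Apple API:

```swift
// Hypothetical sketch of on-device vs Private Cloud Compute routing.
// None of these types or names are real Apple APIs.

enum InferenceTarget {
    case onDevice       // request fits the local model
    case privateCloud   // request needs the larger server-based models
}

struct AIRequest {
    let prompt: String
    let estimatedComplexity: Int   // stand-in for whatever heuristic Apple actually uses
}

func selectTarget(for request: AIRequest, onDeviceLimit: Int = 100) -> InferenceTarget {
    // If the request exceeds what the on-device model can handle,
    // fall back to the Private Cloud Compute cluster.
    request.estimatedComplexity <= onDeviceLimit ? .onDevice : .privateCloud
}

let request = AIRequest(prompt: "Summarize this 40-page PDF", estimatedComplexity: 400)
print(selectTarget(for: request))   // privateCloud
```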

Apple describes the devices in Private Cloud Compute as "custom-built server hardware that brings the power and security of Apple silicon to the datacenter." Cupertino also uses the term "compute node," but it's unclear if that's a synonym for "server." Apple has further confused matters by describing a "Private Cloud Compute cluster" as being pressed into service when iThing users turn to the cloud for AI resources unavailable on their devices.

Whatever the correct term for the machines, and their configuration, Apple says they use "the same hardware security technologies used in iPhone, including the Secure Enclave and Secure Boot."

The machines run a new operating system that Apple's described as "a hardened subset of the foundations of iOS and macOS tailored to support Large Language Model (LLM) inference workloads while presenting an extremely narrow attack surface."

That OS omits "components that are traditionally critical to datacenter administration, such as remote shells and system introspection and observability tools," Apple wrote. Even the kind of telemetry needed by the site reliability engineers who keep Apple's cloud running has evidently been minimized to offer "only a small, restricted set of operational metrics" – so that info processed by Apple's models is inaccessible to humans other than the end-users who provide it. In other words, cloud sysadmins can't access personal info "even when working to resolve an outage or other severe incident."
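Apple hasn't enumerated which metrics survive that cut, but the design it describes amounts to an allowlist: pre-approved operational counters get out of the node, and anything derived from user requests does not. A hypothetical Swift sketch of that filtering, with invented metric names:

```swift
// Hypothetical illustration of a telemetry allowlist.
// Metric names are invented; Apple hasn't published its actual set.

let allowedMetrics: Set<String> = [
    "node.cpu_utilization",
    "node.memory_pressure",
    "inference.queue_depth",
    "inference.error_count",
]

struct MetricSample {
    let name: String
    let value: Double
}

// Only pre-approved operational counters leave the node; everything
// else – including anything derived from user requests – is dropped.
func export(_ samples: [MetricSample]) -> [MetricSample] {
    samples.filter { allowedMetrics.contains($0.name) }
}

let samples = [
    MetricSample(name: "node.cpu_utilization", value: 0.72),
    MetricSample(name: "request.prompt_text_length", value: 512), // dropped
]
print(export(samples).map(\.name))   // ["node.cpu_utilization"]
```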

Apple has not otherwise revealed any information about the servers' CPUs.

The presence of the same Secure Enclave and Secure Boot tech used in the iPhone suggests the silicon shares some elements with the A-series designs Apple uses in its smartphones and lower-end tablets.

Recent A-series chips boast a 16-core Neural Engine – the same core count found in the recently announced M4 processor that debuted in the iPad Pro. It's unclear exactly how the two engines compare.

The most recent iPhone chip – the A17 Pro – implements Arm's v8.6-A instruction set. The M4 is thought to use the more modern v9.4-A.

Maybe Apple cooked up a custom chip for these servers. It certainly operates at a scale that makes doing so feasible.

Whatever is inside, Apple's use of Arm-based silicon for AI servers is yet more evidence – if any were needed – that the Arm architecture is ready for datacenter duty in demanding applications.

AWS, Google, Oracle and Microsoft all offer Arm-powered servers in their public clouds for general purpose workloads, and tout them as offering superior price/performance compared to chips from Intel and AMD on some jobs.

Surely Apple would not be betting its next-gen AI on silicon that isn't ready to deliver its promised integration of cloudy and on-device action?

In true Apple fashion, we have one more thing to note: nothing associated with these cloud servers suggests the company intends to return to the business of selling servers for your datacenter – a field it abandoned over a decade ago.®
