Apple's on-device gen AI for the iPhone should surprise no one. The way it does it might

They have the hardware, now they just need models that don't, ah... suck?

Comment Apple’s efforts to add generative AI to its iDevices should surprise no one, but Cupertino's existing uses of the tech, and the constraints of mobile hardware, suggest it won’t be a big feature of iOS in the near future.

Apple has not joined the recent wave of generative AI boosterism and, unlike many businesses, has generally avoided the terms "AI" or "Artificial Intelligence" in its recent keynote presentations. Yet machine learning has been, and continues to be, a key capability for Apple, mostly in the background in the service of subtle improvements to the user experience.

Apple's use of AI to handle images is one example of the technology at work in the background. When iThings capture photos, machine learning algorithms go to work identifying and tagging subjects, running optical character recognition, and adding links.

In 2024 that sort of invisible AI doesn’t cut it. Apple’s rivals are touting generative AI as an essential capability for every device and application. According to a recent Financial Times report, Apple has been quietly buying AI companies and developing its own large language models to ensure it can deliver.

Apple's hardware advantage

Neural processing units (NPUs) in Apple’s homebrew silicon handle its existing AI implementations. Apple has employed the accelerators, which it terms “Neural Engines,” since the debut of 2017’s A11 system-on-chip, and uses them to handle smaller machine learning workloads to free a device's CPU and GPU for other chores.

Apple's NPUs are particularly powerful. The A17 Pro found in the iPhone 15 Pro is capable of pushing 35 TOPS, double that of its predecessor, and about twice that of some NPUs Intel and AMD offer for use in PCs.

Qualcomm's latest Snapdragon chips are right up there with Apple's in terms of NPU perf. Like Apple, Qualcomm also has years of NPU experience in mobile devices. AMD and Intel are relatively new to the field.

Apple hasn't shared floating point or integer performance for the chip's GPU, although it has touted its prowess running games, like the Resident Evil 4 Remake and Assassin's Creed Mirage. This suggests that computational power isn't the limiting factor for running bigger AI models on the platform.

Further supporting this is the fact that Apple's M-series silicon, used in its Mac and iPad lines, has proven particularly potent for running AI inference workloads. In our testing, given adequate memory (we ran into trouble with less than 16GB), a now three-year-old M1 MacBook Air was more than capable of running Llama 2 7B at 8-bit precision and was even snappier with a 4-bit quantized version of the model. By the way, if you want to try this on your M1 Mac, makes running Llama 2 a breeze.

Where Apple may be forced to make hardware concessions is with memory.

Generally speaking, AI models need about a gigabyte of memory for every billion parameters when running at 8-bit precision. This can be halved either by dropping to lower precision, something like Int-4, or by developing smaller, quantized models.
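To see why dropping from 8-bit to 4-bit halves the footprint, here is a minimal Python sketch, not anything Apple or Qualcomm has described. It assumes a simple symmetric quantization scheme with a single scale factor; the function names are hypothetical and real quantizers are considerably more sophisticated. Two 4-bit weights fit in each byte, where an 8-bit weight needs a byte to itself.

```python
def quantize_int4(weights):
    """Symmetric 4-bit quantization (assumed scheme): floats -> ints in [-8, 7]."""
    max_abs = max(abs(w) for w in weights) or 1.0
    scale = max_abs / 7  # one shared scale factor for the whole list
    return [max(-8, min(7, round(w / scale))) for w in weights], scale


def pack_int4(values):
    """Pack two signed 4-bit values into each byte."""
    if len(values) % 2:
        values = values + [0]  # pad to an even count
    packed = bytearray()
    for lo, hi in zip(values[0::2], values[1::2]):
        packed.append((lo & 0x0F) | ((hi & 0x0F) << 4))
    return bytes(packed)


def unpack_int4(packed, count):
    """Recover the first `count` signed 4-bit values from packed bytes."""
    out = []
    for byte in packed:
        for nibble in (byte & 0x0F, byte >> 4):
            out.append(nibble - 16 if nibble >= 8 else nibble)
    return out[:count]


weights = [7.0, -3.0, 1.0, 0.0, -7.0]
quantized, scale = quantize_int4(weights)
packed = pack_int4(quantized)
print(len(weights), "weights stored in", len(packed), "bytes")
```

Multiplying the unpacked integers by the scale gives back approximations of the original weights, which is the precision being traded for the smaller footprint.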

Llama 2 7B has become a common reference point for AI PCs and smartphones due to its relatively minor footprint and computation requirements when running small batch sizes. Using 4-bit quantization, the model's requirements can be cut to 3.5GB.
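The rule of thumb above reduces to simple arithmetic. The Python sketch below is a back-of-the-envelope estimate only: the function name is ours, and it counts model weights alone, ignoring the working memory a model needs at run time and whatever the operating system and other apps are using.

```python
def weights_footprint_gb(params_billions, bits_per_weight):
    """Approximate memory for model weights alone, in GB (10^9 bytes)."""
    bytes_per_weight = bits_per_weight / 8
    return params_billions * bytes_per_weight


for bits in (16, 8, 4):
    size = weights_footprint_gb(7, bits)
    print(f"Llama 2 7B at {bits}-bit: ~{size:.1f} GB")
```

For a 7-billion-parameter model this gives roughly 14 GB, 7 GB, and 3.5 GB respectively, which is why only the 4-bit version leaves meaningful headroom on a phone with 8GB of RAM.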

But even with 8GB of RAM on the iPhone 15 Pro, we suspect Apple's next gen of phones may need more memory, or the models will need to be smaller and more targeted. This is likely one of the reasons that Apple is opting to develop its own models rather than co-opting models like Stable Diffusion or Llama 2 to run at Int-4, as we've seen from Qualcomm.

There's also some evidence to suggest that Apple may have found a way around the memory problem. As spotted by the Financial Times, back in December, Apple researchers published [PDF] a paper demonstrating the ability to run LLMs on-device using flash memory.

Expect a more conservative approach to AI

When Apple does introduce AI functionality on its desktop and mobile platforms, we expect it to take a relatively conservative approach.

Turning Siri into something folks don't feel needs to be spoken to like a pre-school child seems an obvious place to start. Doing that could mean giving an LLM the job of parsing input into a form that Siri can more easily understand, so the bot can deliver better answers.

Siri could become less easily confused if you phrase a query in a roundabout way, resulting in more effective responses.

In theory, this should have a couple of benefits. The first is that Apple should be able to get away with using a much smaller model than something like Llama 2. The second is that it should largely avoid the issue of the LLM producing erroneous responses.

We could be wrong, but Apple has a track record of being late to implement the latest technologies, then finding success where others have failed by taking time to refine and polish features until they are actually useful.

And for what it’s worth, generative AI is yet to prove it’s a hit: Microsoft's big chatbot bet to breathe life into no one's favorite search engine Bing hasn't translated into a major market share increase.

Apple, meanwhile, took the crown as 2023’s top smartphone vendor while deploying only invisible AI. ®
