Untangling Meta's plan for its homegrown AI chips, set to actually roll out this year

So that's where all the laid-off semiconductor engineers went!

After years of development, Meta may finally roll out its homegrown AI accelerators in a meaningful way this year.

The Facebook empire confirmed its desire to supplement deployments of Nvidia H100 and AMD MI300X GPUs with its Meta Training and Inference Accelerator (MTIA) family of chips this week. Specifically, Meta will deploy an inference-optimized processor, reportedly codenamed Artemis, based on the Silicon Valley giant's first-gen parts teased last year.

"We're excited about the progress we've made on our in-house silicon efforts with MTIA and are on track to start deploying our inference variant in production in 2024," a Meta spokesperson told The Register on Thursday.

"We see our internally developed accelerators to be highly complementary to commercially available GPUs in delivering the optimal mix of performance and efficiency on Meta-specific workloads," the rep continued. Details? Nope. The spokesperson told us: "We look forward to sharing more updates on our future MTIA plans later this year."

We're taking that to mean the second-gen inference-focused chip is rolling out widely, following a first-gen lab-only version for inference, and we may find out later about parts intended primarily for training or training and inference.

Meta has become one of Nvidia and AMD's best customers as its AI workloads have grown, increasing its need for specialized silicon to make its machine-learning software run as fast as possible. Thus, the Instagram giant's decision to develop its own custom processors isn't all that surprising.

In fact, the mega-corp is, on the face of it, relatively late to the custom AI silicon party in terms of real-world deployment. Amazon and Google have for some years been using homegrown components to accelerate internal machine-learning systems, such as recommender models, as well as customer ML code. Meanwhile, Microsoft revealed its homegrown accelerators last year.

But beyond the fact that Meta is rolling out an MTIA inference chip at scale, the social network hasn't disclosed the processor's precise architecture, or which workloads it's reserving for in-house silicon and which it's offloading to AMD and Nvidia's GPUs.

It's likely Meta will run established models on its custom ASICs to free up GPU resources for more dynamic or evolving applications. We've seen Meta go down this route before with custom accelerators designed to offload data- and compute-intensive video workloads.

As for the underlying design, the industry watchers at SemiAnalysis tell us that the new chip is closely based on the architecture in Meta's first-gen parts.

Stepping stones

Announced in early 2023 after three years of development, Meta's MTIA v1 parts, which our friends at The Next Platform looked at last spring, were designed specifically with deep-learning recommender models in mind.

The first-gen chip was built around a RISC-V CPU cluster and fabbed using TSMC's 7nm process. Under the hood, the component employed an eight-by-eight matrix of processing elements, each equipped with two RV CPU cores, one of which was outfitted with vector math extensions. These cores were fed by a generous 128MB of on-chip SRAM and up to 128GB of LPDDR5 memory.
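Taking those published specs at face value, the layout implies a fixed tally of cores. A quick back-of-envelope sketch — the figures are Meta's, the arithmetic is ours:

```python
# Tally of the first-gen MTIA layout described above.
# All inputs come from Meta's published v1 specs; nothing here is measured.

grid = 8 * 8            # eight-by-eight matrix of processing elements (PEs)
cores_per_pe = 2        # two RISC-V cores per PE
vector_cores = grid     # one core per PE carries the vector math extensions

total_cores = grid * cores_per_pe
sram_mb = 128           # shared on-chip SRAM
lpddr5_gb = 128         # maximum off-chip LPDDR5

print(f"{total_cores} RISC-V cores, {vector_cores} with vector extensions")
print(f"{sram_mb} MB SRAM on die, up to {lpddr5_gb} GB LPDDR5 off die")
```

In other words, 128 RISC-V cores in total, half of them vector-capable.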

As Meta claimed last year, the chip ran at 800 MHz and topped out at 102.4 trillion operations per second (TOPS) of INT8 performance, or 51.2 teraFLOPS at half precision (FP16). By comparison, Nvidia's H100 is capable of nearly four petaFLOPS of sparse FP8 performance. While nowhere near as powerful as Nvidia's or AMD's GPUs, the chip did have one major advantage: power consumption. The part had a thermal design power of just 25 watts.
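Those headline numbers make the efficiency trade-off easy to quantify. A rough comparison — note the H100's roughly 700 W SXM TDP is our assumption, and dense INT8 versus sparse FP8 is not an apples-to-apples matchup:

```python
# Rough performance-per-watt comparison using the figures quoted above.
# Caveats: MTIA v1's 102.4 TOPS is dense INT8 at a 25 W chip TDP, while
# the H100 figure is ~3,958 teraFLOPS of *sparse* FP8 at an assumed
# 700 W SXM TDP -- different precisions, so treat this as indicative only.

mtia_tops, mtia_watts = 102.4, 25      # Meta's published v1 numbers
h100_tflops, h100_watts = 3958, 700    # Nvidia sparse FP8 spec, SXM TDP

mtia_eff = mtia_tops / mtia_watts      # ops per watt, INT8
h100_eff = h100_tflops / h100_watts    # FLOPS per watt, sparse FP8

print(f"MTIA v1: {mtia_eff:.1f} TOPS/W")
print(f"H100:    {h100_eff:.1f} TFLOPS/W")
```

On that crude measure the little 25 W part lands in the same order of magnitude as a 700 W-class GPU, which is the whole pitch for running well-understood inference workloads on it.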

According to SemiAnalysis, Meta's latest chip boasts improved cores and trades LPDDR5 for high-bandwidth memory packaged using TSMC's chip-on-wafer-on-substrate (CoWoS) tech.

Another notable difference is that Meta's second-gen chip will actually see widespread deployment across its datacenter infrastructure. According to the Facebook titan, while the first-gen part was used to run production advertising models, it never left the lab.

Chasing artificial general intelligence

Custom parts aside, the Facebook and Instagram parent has dumped billions of dollars into GPUs in recent years to accelerate all manner of tasks ill-suited to conventional CPU platforms. However, the rise of large language models, such as GPT-4 and Meta's own Llama 2, has changed the landscape and driven the deployment of massive GPU clusters.

At the scale Meta operates, these trends have necessitated drastic changes to its infrastructure, including the redesign of several datacenters to support the immense power and cooling requirements associated with large AI deployments.

And Meta's deployments are only going to get larger over the next few months as the company shifts focus from the metaverse to the development of artificial general intelligence. Supposedly, work done on AI will help form the metaverse or something like that.

According to CEO Mark Zuckerberg, Meta plans to deploy as many as 350,000 Nvidia H100s this year alone.

The biz also announced plans to deploy AMD's newly launched MI300X GPUs in its datacenters. Zuckerberg claimed his corporation would end the year with the equivalent computing power of 600,000 H100s. So clearly Meta's MTIA chips won't be replacing GPUs anytime soon. ®
