Ampere: Cloud biz buy-ins prove our Arm server CPUs are the real deal
Startup teases 128+ core chip, disses Xeon and Epyc, unsurprisingly
Interview After two years of claiming that its Arm-powered server processors provide better performance and efficiency for cloud applications than Intel's or AMD's, Ampere Computing said real deployments by cloud providers and businesses are proving its chips are the real deal.
The Silicon Valley startup held its Annual Strategy and Product Roadmap Update last week to ostensibly give a product roadmap update. But the only update was the news that Ampere's 5nm processor due later this year is called Ampere One, it's sampling that with customers, and it will support PCIe Gen 5 connectivity and DDR5 memory.
Ampere CEO, and former Intel president, Renee James showed off the upcoming Ampere One chip during the roadmap update
What the video update really felt like was a message to investors and the industry that Ampere's high-core-count Altra processors are deployed at several major cloud providers and are already making a big difference for some businesses. After all, Ampere, which has raised a lot of cash from Oracle, confidentially filed for an initial public offering in April.
In an interview with The Register, Ampere chief product officer Jeff Wittich said the startup made customer testimonials the main topic of the presentation to prove that Ampere has executed on its roadmap so far and delivered the goods, literally speaking.
"The real story here is this stuff's real. We've got people using it every day. It's actually changing their businesses. It's actually impacting people's lives. It's easy to get access to it now. And that became a big part of the story," said Wittich, who previously worked at Intel for 15 years.
The name drops in Ampere's video included Microsoft Azure, Oracle Cloud, Tencent Cloud, Equinix Metal, Alibaba Cloud, UCloud, and JD Cloud, all of which have now launched instances powered by Ampere's Altra chips. German cloud provider Hetzner is planning Ampere-based services, and Cloudflare has been using Ampere chips in servers to handle internet requests faster than x86 silicon.
The roadmap update included testimonials from autonomous vehicle startup Cruise, which said Ampere's processors were the only ones on the market that could serve its high-throughput needs. Another booster in the video was the Oracle Red Bull Formula One racing team, which said it used Oracle's Ampere A1 Compute instance to increase the number of simulations it runs to test the aerodynamics of F1 cars by around 25 percent. There was also biomedical software firm Project Ronin, which said it turned to Ampere for its "cost-effective" computing.
Other companies that gave lip service to Ampere included server maker Supermicro, storage provider DataDirect Networks, Chinese tech giant Baidu, and cloud hosting service GleSYS. And that still doesn't cover all the organizations that Ampere said are supporting its chips.
While Ampere is getting loud about momentum with customers and partners, the startup is staying quiet about any financial details for now. The most we could get from Wittich is that Ampere saw "massive growth" in 2021 and that revenue is growing this year.
"We're not at a million CPUs a year yet, but that's the trajectory we're on, to get there as rapidly as possible so that we really do have the scale that you need for this type of an operation. That's where we're looking, and it's not in the distant future," he said.
Claims of better performance scaling than Intel, AMD
One major selling point of Ampere's Altra microprocessors is the fact that they can pack more cores than the 64-core maximum of AMD's Epyc and the 40-core maximum of Intel's Xeon. Ampere eclipsed both companies with the 2020 release of the 80-core Altra, and then it went even further in 2021 with the release of the 128-core Altra Max.
These higher core counts make the Altra processors well-suited for the high-density needs of cloud computing, but Wittich said what really makes his biz's processors better than Epyc or Xeon is that they have less performance variability between each core.
A graph Ampere uses to claim that its chips have far less performance variability than x86 ones.
Wittich claimed that performance variability is an issue in Xeon and Epyc processors: a software thread running on one CPU core may run slower than if it was on another core, or may run faster, or the same. The result is that users can't always be sure how fast or slow their software will run on these hosts, though Wittich said cloud providers have developed ways to smooth out these inconsistencies and mitigate the causes of any slowdowns.
"Cloud providers try and move users away from each other. They try and detect this, migrate people. They try and cap the amount of resources that one person can have. It's not foolproof, and all those things add complexity and overhead," he opined.
Wittich said these kinds of fixes may hide performance issues of x86 processors from users, but it means that a significant amount of a cloud provider's capacity is "totally wasted," which results in extra costs.
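To make the idea of per-core variability concrete, here is a minimal sketch of how one might quantify it, assuming you have already timed the same workload pinned to each core of a host. The function names and the two metrics are our own illustration, not anything Ampere or the cloud providers said they use.

```python
from statistics import mean, pstdev


def coefficient_of_variation(runtimes):
    """Population standard deviation divided by the mean.

    0.0 means every core finished in exactly the same time;
    larger values mean less predictable performance.
    """
    if not runtimes:
        raise ValueError("need at least one runtime")
    avg = mean(runtimes)
    if avg == 0:
        raise ValueError("mean runtime is zero")
    return pstdev(runtimes) / avg


def spread_ratio(runtimes):
    """Slowest runtime divided by fastest runtime (1.0 means identical)."""
    return max(runtimes) / min(runtimes)


if __name__ == "__main__":
    # Made-up runtimes in seconds, purely illustrative, not measurements.
    sample = [10.0, 10.5, 9.8, 12.0]
    print(coefficient_of_variation(sample), spread_ratio(sample))
```

A host whose cores all behave alike would score near 0.0 on the first metric and near 1.0 on the second.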
Ampere's Altra processors avoid these variability issues for a few reasons, according to Wittich. For one, all the cores are single-threaded: each CPU core runs one hardware thread, so no two threads compete for the same core's resources. Meanwhile, Xeon and Epyc processors typically run two hardware threads through each CPU core as they support simultaneous multi-threading (SMT). If those two hardware threads aren't in contention, you can get good performance; if they contend, a slowdown may occur. The host server operator can enable or disable SMT.
Wittich said the Altra chips are also more energy efficient, which allows them to sustain the same clock frequency at all times, unlike Xeon or Epyc.
"That's one of the reasons why we're able to scale to really high utilization and not drop off. We've got the power envelope to do it," he said.
A graph Ampere uses to claim that its chips have better performance scaling than x86 chips.
Another factor that makes performance consistent between cores, according to Wittich, is that Ampere designed its processors to have larger L1 and L2 caches that are private to each CPU core, which keeps the cores primed with plenty of data and code that can be rapidly accessed, thus avoiding wasting time reaching out to RAM.
"People can keep their data in the cache for longer, which means less variability in performance, because you don't have to keep going back out to memory because your neighbor wanted the rest of the cache and you evicted all of your cache contents," he said.
What also helps keep the cores even is the mesh-based interconnect used to connect all the components on Ampere's chips, Wittich added.
"We've done a lot on the mesh front to ensure that we've got really consistent performance," he said. "It's one reason why today our products are monolithic — not going to say they're going to be monolithic forever — but what we will do is make sure that when our products move to a disaggregated approach, that our users still have a monolithic experience."
Ampere's plan to keep selling old chips alongside new ones
Later this year, Ampere plans to release a 5nm processor, named Ampere One, and the startup is promising it will come with even more cores, higher performance, and better power efficiency than its current Altra chips. Most notably, Ampere One will use a custom, Arm-compatible core designed by the startup, unlike the off-the-shelf core blueprints Ampere licensed from Arm for the Altra family.
Wittich said it was always the plan for Ampere to design its own CPU. "There are a lot of cool things that we've done that allows us to scale out to this many cores, not blow past the power budget, and maintain that really, really consistent performance that our users want," he said.
Customers have been sampling Ampere One since earlier this year, and while Wittich said the processor won't enter mass production by the end of this year, it will be available in commercial servers by then.
"Once it's in people's production data centers, running real live workloads, and we've qualified it, that, to me, is the key milestone. And so that's what we're driving to for 2022," he said. "Obviously, we'll ramp volume as fast as we can to keep pulling in more customers."
The interesting twist is that in addition to introducing chips on a yearly cadence, Ampere plans to keep selling its older processors well into the future — maybe even the next 10 years for Altra and Altra Max, Wittich suggested. This differs from Intel, which typically supports products from the last one or two generations before discontinuing them. In other words, imagine if Intel was still selling Haswell-based Xeon processors from 2014, and you get the idea for Ampere's plan.
Wittich said Ampere is doing this because it's what customers want and it's necessary so that customers don't get left behind if they don't need to update to the latest chip.
"Our customers want the best of both worlds. So they want us to innovate and have a new processor every single year that has new features, but they don't necessarily want to use that new processor for everything, not on day one," Wittich said. It sounds somewhat contrary, telling customers it's OK to keep buying the old boring chips instead of the new shiny ones. But Wittich said it makes the most sense for how customers run their datacenters.
"With all the innovation we're doing, there's going to be some innovations that really matter to some people, and then there's going to be other people that say, 'you know, I'll wait for the one in year two. That's the basket of innovations that I cared more about,'" he said.
What Ampere will need to contend with, among other things, is the fact that both Intel and AMD plan to release their own cloud-optimized processors in the near future, with the former's Sierra Forest chips expected in 2024 and the latter's Bergamo chips arriving in 2023. But we do like the idea of a world that isn't dominated by only two major server CPU providers. ®