Here’s what it’ll take for Nvidia and other US chipmakers to flog AI chips in China
Jensen be limbo, Jensen be quick, Jensen go under Uncle Sam's limbo stick
Over the past few years, Uncle Sam has made it progressively harder for US chip designers to flog their AI wares in China. But not impossible.
Initially, the rules capped the high-speed interconnects used to stitch multiple GPUs together. By 2023, the bar had been lowered to cap processor performance.
Each time the rules have gotten tighter, Nvidia, AMD, and others have risen to the challenge, quietly unveiling sanctions-compliant versions of their flagship products.
In April, many of these chips, including Nvidia's H20 and AMD's MI308, were effectively barred from sale in China when Uncle Sam once again lowered the boom to limit memory and I/O bandwidth.
But Nvidia just keeps charging forward into the Middle Kingdom.
The chipmaker's latest GPU to limbo under the US Commerce Department's bar will reportedly be based on the RTX Pro 6000-series server chips.
Announced at GTC in March, the parts boast up to 4 petaFLOPS of sparse performance at 4-bit floating point precision and 96GB of GDDR7, good for 1.6TB/s of memory bandwidth.
That performance will need to be cut back considerably for the China-spec version, which is reportedly called the RTX Pro 6000D. Details regarding the chip remain light, but the way US export controls have been written means any sanctions-compliant chips are going to have to follow the same recipe.
Nvidia, for its part, isn't sugarcoating the challenge. "We are still evaluating our limited options. Until we settle on a new product design and receive approval from the US government, we are effectively foreclosed from China's $50 billion datacenter market," an Nvidia spokesperson told The Register.
Building a sanctions-compliant AI accelerator in 2025
So you want to build a sanctions-compliant accelerator in 2025? The first thing you'll want to do is avoid high-bandwidth memory, and we're not just talking HBM, either. Strapping too much GDDR7 or LPDDR5x memory to your chip could push it over the edge.
Why is Uncle Sam suddenly concerned with how fast your memory is? When it comes to AI inference — the act of actually using the model — memory bandwidth is usually the bottleneck.
This is the requirement that derailed all those billions of dollars of H20, MI308, and Gaudi shipments to China.
"H20 integrated circuits and any other circuits achieving the H20's memory bandwidth, interconnect bandwidth, or combination thereof," are now subject to US export controls, Nvidia explained in a recent regulatory filing.
This is where things get a bit fuzzy. Unlike with previous export controls, the Commerce Department's Bureau of Industry and Security (BIS) hasn't issued specific guidance on how much I/O or memory bandwidth is too much.
However, an April email from Intel to its Chinese customers, reviewed by the Financial Times, reportedly set these limits at 1.4 TB/s of DRAM bandwidth, 1.1 TB/s of I/O bandwidth, or a combined bandwidth of 1.7 TB/s.
These limits preclude the sale of just about any existing accelerator built using HBM, which itself is already on a rocky footing with US export czars. This is likely why Nvidia CEO Jensen Huang was recently quoted as saying it's the end of the line for its Hopper-based chips in China — they were only ever designed with HBM in mind.
Nvidia's Blackwell-based RTX Pro graphics cards don't use HBM, instead favoring consumer-oriented GDDR7 memory. The server edition of the chip tops out at 96GB and 1.6TB/s of bandwidth.
Nvidia would need to shave off at least 200GB/s of memory bandwidth to get under the reported limit, but 1.4TB/s is still a decent amount, especially if you're planning to deploy mixture of experts (MoE) models, like Alibaba's Qwen3-235B or DeepSeek's V3 or R1. (If you're curious why, check out our deep dive on MoE architectures here.)
I/O doesn't appear to be an issue, as the RTX Pro 6000's 16 lanes of PCIe 5.0 cap out at 128GB/s of bidirectional bandwidth.
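If you want to see how those thresholds play out, here's a minimal back-of-the-envelope check in Python. The ceilings are the ones from the Intel email as reported by the FT, not official BIS figures, and treating the combined limit as a simple sum of the two is our assumption.

```python
# Reported bandwidth ceilings from Intel's April email (per the FT) - not official BIS guidance
DRAM_LIMIT_TBS = 1.4      # reported DRAM bandwidth ceiling, TB/s
IO_LIMIT_TBS = 1.1        # reported I/O bandwidth ceiling, TB/s
COMBINED_LIMIT_TBS = 1.7  # reported combined ceiling, TB/s

# RTX Pro 6000 server edition, per Nvidia's published specs
rtx_pro_6000 = {
    "dram_tbs": 1.6,    # 96GB of GDDR7 at 1.6 TB/s
    "io_tbs": 0.128,    # 16 lanes of PCIe 5.0, ~128 GB/s bidirectional
}

def exceeds_reported_limits(chip: dict) -> bool:
    """Return True if any of the three reported bandwidth ceilings is breached."""
    return (
        chip["dram_tbs"] > DRAM_LIMIT_TBS
        or chip["io_tbs"] > IO_LIMIT_TBS
        or chip["dram_tbs"] + chip["io_tbs"] > COMBINED_LIMIT_TBS
    )

print(exceeds_reported_limits(rtx_pro_6000))  # True - the 1.6 TB/s of GDDR7 is the problem
```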
A whole lot of wasted sand
Right, the memory bit's taken care of. Next, you're going to need a lot of silicon. You won't be using most of it, but the bigger your chip, the higher your performance is allowed to be.
In the case of the RTX Pro 6000, we know its die area is 750mm2. So using that info and its 4-bit width, we can calculate how much performance Nvidia will have to shave to sell it in China according to the current requirements (which haven't changed since 2023).
This graphic from the Center for Strategic and International Studies does a nice job of illustrating the trade-offs. The vertical axis measures TPP, or total processing performance. The horizontal axis, meanwhile, measures performance per square millimetre of silicon, or performance density (PD). Ideally, you'd want a chip with both a high TPP and a high PD, but the sweet spot for building a single chip that can limbo under US export controls is actually somewhere in the middle.

The sweet spot for modern AI accelerators is right between 2400 TPP and 3.2 PD. Image credit: CSIS
Specifically, let's aim for a TPP of less than 2,400 and a performance density (PD) under 3.2.
To calculate TPP and PD you need to know just three variables:
- Advertised teraOPS or teraFLOPS
- The "bit width" or precision of those OPS or FLOPS
- The total die area of the chip in mm2
To find TPP, multiply the advertised teraOPS by the bit width, aka precision. PD, meanwhile, can be found by dividing TPP by your chip's die area.
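For those who prefer it in code, here's a minimal sketch of the two formulas (the function names are ours, not anything official):

```python
def tpp(teraops: float, bit_width: int) -> float:
    """Total processing performance: advertised teraOPS multiplied by bit width."""
    return teraops * bit_width

def performance_density(tpp_value: float, die_area_mm2: float) -> float:
    """Performance density: TPP per square millimetre of die."""
    return tpp_value / die_area_mm2
```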
In the case of the RTX Pro 6000, the math looks a bit like this:
- 4000 teraOPS x 4-bit width = 16,000 TPP
- 16,000 TPP / 750mm2 die area = 21.3 PD
Obviously, that's not gonna fly with US customs enforcement, so we've got to get that under the limits. Since we know the RTX Pro 6000's die area is 750mm2, and the bit-width is 4, finding the 6000D's max theoretical performance is as simple as solving for X.
- 3.1 PD * 750mm2 = 2,325 TPP
- 2,325 TPP / 4-bits = 581 teraOPS
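The same arithmetic in Python, for anyone who wants to sanity-check it (the 3.1 PD target is simply our bit of margin under the 3.2 ceiling):

```python
DIE_AREA_MM2 = 750
BIT_WIDTH = 4
ADVERTISED_TERAOPS = 4000   # sparse FP4 performance of the RTX Pro 6000
PD_TARGET = 3.1             # comfortably under the 3.2 ceiling

full_tpp = ADVERTISED_TERAOPS * BIT_WIDTH   # 16,000 TPP
full_pd = full_tpp / DIE_AREA_MM2           # ~21.3 PD

max_tpp = PD_TARGET * DIE_AREA_MM2          # 2,325 TPP
max_teraops = max_tpp / BIT_WIDTH           # ~581 teraOPS

cut = 1 - max_teraops / ADVERTISED_TERAOPS  # ~0.85
print(f"Max compliant teraOPS: {max_teraops:.0f} ({cut:.0%} cut)")
```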
In other words, in order to sell the RTX Pro 6000 in China, Nvidia will have to shave off about 85 percent of its performance.
No guarantees
Of course, even if you do everything right, you still could end up writing off billions of dollars worth of inventory and sales the next time the rules change.
After the latest round of AI performance caps was announced, Nvidia warned it'd take a $5.5 billion charge related to H20 inventory, purchase commitments, and related reserves in the first quarter of its 2026 fiscal year.
The total hit could be several times that, similar to what we saw with AMD, which booked an $800 million charge on MI308 accelerators but also expects to miss out on $1.5 billion in revenues in 2025 due to the updated trade restrictions.
- Oracle's $40B Nvidia hardware haul may be too hot for OpenAI's Abilene, Texas DC to handle
- Turns out using 100% of your AI brain all the time isn't most efficient way to run a model
- Nvidia ain't done with x86 as it taps Intel Xeons to babysit GPUs
- Nvidia CEO Jensen Huang labels US GPU export bans 'precisely wrong' and 'a failure'
US chip companies aren't exactly thrilled with these changes. At Computex last week, Nvidia CEO Jensen Huang took the opportunity to decry Uncle Sam's GPU export bans, calling them "precisely wrong" and "a failure."
Beyond depriving shareholders of profits, denying China access to cutting-edge tech is holding back AI advancement and will ultimately harm humanity, Huang argued.
Part of Huang's argument is rooted in the fact that roughly half of the world's AI researchers are located in China. Cutting off their access to Nvidia hardware, he argued, effectively cuts the rest of the world off from their innovations.
Diminishing returns
Even if you do everything right and limbo under Uncle Sam's licensing requirements, those performance caps mean it's only a matter of time before homegrown Chinese accelerators overtake your export-compliant parts.
As our sibling site The Next Platform recently discussed, Huawei's Ascend series of AI accelerators already offers better performance, and now that Nvidia is barred from selling its H20 accelerators in the Middle Kingdom, any advantage from higher memory bandwidth is null and void.
US chip designers may be able to compete on volume or software compatibility, but eventually China's AI supply chains will catch up. ®