China's GPU contender Moore Threads reveals card that can cope with Nvidia’s CUDA

MTT S4000 GPU isn't super-fast, but the 'kilocard cluster' design supporting it looks interesting

Moore Threads, a Chinese purveyor of GPUs, has unveiled its mightiest model to date – and it may even give market leader Nvidia a little to worry about.

The MTT S4000 packs 48GB of video memory and 768GB/sec of video memory bandwidth on each card. But Moore Threads hasn't detailed core count or frequency, saying only that its homebrew Moore Threads Unified System Architecture (MUSA) powers the device. MUSA is said to be compatible with both x86 and Arm, but the silicon slinger has been cagey about its exact capabilities and composition.

Moore Threads rates the GPU's FP32 performance at 25 TFLOPS, and its INT8 capabilities at 200 TOPS.

Those are numbers that won't worry Nvidia, or make Intel and AMD feel they've fallen behind.

Not that any of the three big US chipmakers have much reason to worry about Moore Threads, given the Chinese company was added to the US's entity list of companies that are persona non grata stateside. Moore Threads products are not going to appear on the shelves at Best Buy any time soon, and major hyperscalers won't want them either.

The upstart chip shop may, however, irritate Nvidia with its "MUSIFY" tool, which promises easy migration of CUDA code to the MTT S4000. CUDA is Nvidia's flagship development environment for GPU-centric apps, and the company does enjoy selling integrated bundles. Patriotic Chinese developers porting code to Moore Threads is entirely plausible.

As are scenarios in which Chinese devs use the devices to build AI services: the MTT S4000 supports the LLaMA and GPT models, plus many more.

The Chinese company also revealed a "kilocard cluster" that puts 1,000 of its GPUs in harness, and claimed that China's Zhiyuan Research Institute has used it to train a 70-billion-parameter model in 33 days, and that a 130-billion-parameter model would take 56 days.
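
For a rough sense of scale, the claimed training times can be turned into card-days. The sketch below is back-of-envelope arithmetic on the figures quoted above. It assumes all 1,000 cards run for the full duration, and it ignores dataset size and utilization, neither of which Moore Threads has disclosed, so the per-parameter numbers are indicative only. The function names are ours, for illustration.

```python
# Back-of-envelope arithmetic on the training times claimed for the kilocard cluster.
# Assumption: every one of the 1,000 cards is busy for the whole run.
CARDS = 1_000


def card_days(days: int, cards: int = CARDS) -> int:
    """Total card-days consumed by a training run."""
    return days * cards


def card_days_per_billion_params(days: int, params_billion: int, cards: int = CARDS) -> float:
    """Card-days spent per billion model parameters."""
    return card_days(days, cards) / params_billion


# (days, parameters in billions), as quoted in the article
runs = {"70B": (33, 70), "130B": (56, 130)}

for name, (days, params) in runs.items():
    total = card_days(days)
    per_billion = round(card_days_per_billion_params(days, params), 1)
    print(f"{name}: {total} card-days, {per_billion} card-days per billion parameters")
```

On those assumptions the 70B run works out to 33,000 card-days and the 130B run to 56,000, or roughly 471 and 431 card-days per billion parameters respectively.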

China's major clouds are all-in on AI. As US chip sanctions bite, perhaps Moore Threads will be able to sell them some kilocard clusters. But with the likes of Baidu having stockpiled sufficient accelerators to sustain its AI chatbot for a year or more, and forbidden Nvidia kit constantly crossing the border, the Chinese GPU-maker will need to accelerate its accelerators just to catch up. ®
