Nvidia watches Brit upstart Graphcore swing into rear-view mirror waving beastly second-gen AI chip hardware
Colossus Mk2 boasts of even more transistors than the top GPU
British AI chipmaker Graphcore has announced a new series of hardware products based on its latest second-generation Intelligence Processing Unit (IPU) known as the Colossus Mk2 GC200.
Hot on the heels of its top competitor, Nvidia, which launched its AI-focused A100 GPU in May, the Bristolian upstart is also keen to flex its machine-learning muscles with its own 7nm, TSMC-fabricated chip.
CEO Nigel Toon claimed the Colossus Mk2 was the world's most complex processor and said the hardware could be scaled up to build systems from single-rack servers to supercomputers.
The GC200 packs 59.4 billion transistors onto an 823mm² die, a bit more than the 54 billion transistors on Nvidia's 826mm² A100. Graphcore's chip also contains 1,472 processor cores, 8,832 parallel program threads, and 900MB of on-chip memory. Colossus Mk2 supports FP32 and FP16 precisions.
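For a rough sense of how close the two chips are, the quoted figures can be turned into an average transistor density and a threads-per-core count. The Python below is a back-of-the-envelope sketch using only the numbers in this article; the function names are ours, and average density says nothing about how either vendor actually lays out its silicon.

```python
# Figures as quoted in the article; nothing here comes from a datasheet.
CHIPS = {
    "Graphcore GC200": {"transistors": 59.4e9, "die_mm2": 823},
    "Nvidia A100": {"transistors": 54e9, "die_mm2": 826},
}


def density_millions_per_mm2(transistors, die_mm2):
    """Average transistor density in millions per square millimetre."""
    return transistors / die_mm2 / 1e6


def threads_per_core(threads, cores):
    """Whole number of hardware threads per core."""
    return threads // cores


for name, spec in CHIPS.items():
    density = density_millions_per_mm2(spec["transistors"], spec["die_mm2"])
    print(f"{name}: {density:.1f}M transistors per mm2")

print("GC200 threads per core:", threads_per_core(8832, 1472))
```

On these numbers the GC200 works out at roughly 72 million transistors per mm² against about 65 million for the A100, with six program threads per IPU core.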
Four of these IPUs are fitted into Graphcore's IPU-Machine M2000 to build a rack unit that delivers one petaFLOPS of compute, can access up to 450GB of off-chip memory, and can reach up to 180TB/s of performance using its Exchange-Memory communication link. Each machine is priced at $32,450.
El Reg has asked Graphcore to tell us how it came up with 180TB/s and we're waiting for an explanation. To build an IPU server [PDF], you'll need eight IPU-Machine M2000s, which will cost $259,600.
If that's still not enough compute, then double it to 16 IPU-Machine M2000s to get what Graphcore calls an IPU-POD64, which is aimed at data centres and delivers 16 petaFLOPS of performance. These pods should provide more than enough compute to train the largest of AI models with billions of parameters.
The IPU-POD64k is the largest Graphcore product, made by stacking 1,024 IPU-POD64 19-inch racks. Together these create an AI supercomputer, capable of 16 exaFLOPS of performance.
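The scaling claims above are plain multiplication from the per-machine figures, so they are easy to sanity-check. The sketch below assumes every system is just a stack of identical IPU-Machine M2000s with no overhead. The `system` helper and its field names are hypothetical, and the price column is a naive multiple of the quoted per-machine price, not a Graphcore quote (the article only gives a total for the eight-machine server).

```python
# Per-machine figures as quoted in the article.
IPUS_PER_M2000 = 4
PFLOPS_PER_M2000 = 1
PRICE_PER_M2000_USD = 32_450


def system(machines):
    """Totals for a system built from `machines` IPU-Machine M2000 units.

    Assumes perfectly linear scaling; naive_price_usd is simple
    multiplication and is not a vendor-quoted price.
    """
    return {
        "machines": machines,
        "ipus": machines * IPUS_PER_M2000,
        "petaflops": machines * PFLOPS_PER_M2000,
        "naive_price_usd": machines * PRICE_PER_M2000_USD,
    }


configs = {
    "IPU server": system(8),
    "IPU-POD64": system(16),
    "IPU-POD64k": system(16 * 1024),  # 1,024 IPU-POD64 racks
}

for name, totals in configs.items():
    print(name, totals)
```

Eight machines at $32,450 gives the $259,600 quoted for the IPU server, 16 machines gives the 64 IPUs and 16 petaFLOPS of the IPU-POD64, and 1,024 of those gives 65,536 IPUs and 16,384 petaFLOPS, which Graphcore rounds to 16 exaFLOPS.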
Graphcore said the IPU-Machine M2000 and IPU-POD64 systems are available to pre-order now, and should reach customers from Q4 2020 onward. It has also worked with Cirrascale Cloud Services to test its IPU-POD systems in the cloud for some early customers.
Its systems support most popular machine-learning frameworks, including PyTorch, TensorFlow, ONNX, and PaddlePaddle.
You can find more analysis and commentary on the chip over at our sister site, The Next Platform. ®