ATI launched its first consumer 512MB graphics board this week, and we've been evaluating it for the past few days. Nvidia announced a 512MB part not so long ago, with a 6800 Ultra variant based on the Quadro FX 4400 hardware they've had in the professional space for a wee while. ATI's new product has no immediate pro-level pairing, and with Nvidia looking like it will bring a 512MB 6800 GT to the table soon, we're beginning to see the arrival of true half-a-gig consumer products.
But why haven't we seen this kind of kit before? It's not for any technical reason. I know of engineering hardware at four of the major graphics IHVs with at least that much memory, and the memory controller on nearly all of the previous generation of GPUs can support 512MB. Going back to the idea that Nvidia's 512MB 6800 Ultra is little more than a Quadro FX 4400 with a new BIOS and a couple of minor hardware changes, there's clearly been a need for very large frame buffers on professional-level hardware for quite some time.
Big buffers
The reason those professional parts carry so much memory is memory pressure. Until PCI Express came along, you couldn't write data to off-card memory in anything other than full frame buffer-sized chunks, and doing so consumed CPU resources. That has downsides for professional 3D applications, given that you want plenty of spare CPU time for geometry processing and the preparation work the GPU can't do for you. The CPU's time is better spent on those tasks than on chucking massive frames of data around, and you avoid that by fitting a big on-card frame buffer, reducing the number of times the card has to fall back to main memory.
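To put a rough number on what chucking massive frames of data around actually costs, here's a quick back-of-the-envelope sketch. The 1600 x 1200 resolution, 32-bit colour and 60fps figures are my own illustrative assumptions, not anything measured on this card:

```c
#include <stdio.h>

int main(void)
{
    /* Illustrative assumptions: a 1600 x 1200 render target at 32-bit
       colour, moved off the card once per frame at 60 frames per second. */
    const double width           = 1600.0;
    const double height          = 1200.0;
    const double bytes_per_pixel = 4.0;
    const double frames_per_sec  = 60.0;

    double frame_bytes = width * height * bytes_per_pixel;
    double per_second  = frame_bytes * frames_per_sec;

    printf("One full frame buffer : %.1f MB\n", frame_bytes / (1024.0 * 1024.0));
    printf("Moved every frame     : %.0f MB/s of bus traffic\n",
           per_second / (1024.0 * 1024.0));
    return 0;
}
```

That works out to roughly 7.3MB per frame and well over 400MB/s of traffic if you're shifting it every frame, which is exactly the sort of work you'd rather the CPU didn't have to babysit.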
Why 512MB? With a pro application like medical imaging you might need to work with a couple of million vertices per frame, each defining a colour at a location in 3D space. Given that the imaging application likely wants to work in 32-bit spatial precision and 32-bit colour, that's four bytes (32 bits) for each co-ordinate and another four bytes to define the colour, or 16 bytes per vertex.
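If it helps to see that laid out, here's one plausible vertex layout that hits the 16-byte figure. The struct itself is my own illustration, not a format any particular imaging package mandates:

```c
#include <stdio.h>

/* One plausible layout for the vertices described above: three 32-bit
   floats for the position in 3D space and one packed 32-bit colour. */
struct Vertex {
    float        x, y, z;  /* 3 x 4 bytes of spatial data */
    unsigned int colour;   /* 4 bytes of 32-bit colour    */
};

int main(void)
{
    printf("Bytes per vertex: %lu\n", (unsigned long)sizeof(struct Vertex));
    return 0;
}
```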
A couple of million vertices at 16 bytes a vertex require around 30MB just for the vertex data, per frame. Since it's a medical imaging application you're running, you might want to view a cross-section of the geometry using a 3D volume texture and do some sampling using a pair of smaller cubemaps. You might want to shade the geometry too, using fragment programs that sample a decent amount of texture data. Anything larger than a 256 x 256 x 256 power-of-two volume texture, on top of your two million vertices' worth of vertex buffers, is going to bust past 256MB, leaving you struggling to fit the rest of the texture data and the cubemaps into memory at the same time. If you're then antialiasing everything at high quality, performance goes out the window.
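Adding those pieces up shows how quickly the memory disappears. The sketch below uses the two-million-vertex, 16-byte figure from above and steps the volume texture up to the next power of two, 512 x 512 x 512; the 16-bit texel size and the pair of 256 x 256 cubemaps are assumptions on my part, for illustration only:

```c
#include <stdio.h>

int main(void)
{
    const double MB = 1024.0 * 1024.0;

    /* From the text: two million vertices at 16 bytes each, per frame. */
    double vertex_bytes = 2000000.0 * 16.0;

    /* The next power-of-two volume above 256^3 is 512^3; assume 16-bit
       intensity texels, common for medical volume data. */
    double volume_bytes = 512.0 * 512.0 * 512.0 * 2.0;

    /* Assumption: a pair of 256 x 256 cubemaps (six faces each) at 32-bit. */
    double cubemap_bytes = 2.0 * 6.0 * 256.0 * 256.0 * 4.0;

    printf("Vertex buffers : %6.1f MB\n", vertex_bytes  / MB);
    printf("Volume texture : %6.1f MB\n", volume_bytes  / MB);
    printf("Cubemaps       : %6.1f MB\n", cubemap_bytes / MB);
    printf("Total          : %6.1f MB\n",
           (vertex_bytes + volume_bytes + cubemap_bytes) / MB);
    return 0;
}
```

Under those assumptions the total comes to around 290MB before the frame buffer itself, the fragment programs' texture data or any antialiasing samples get a look-in, which is why half a gigabyte on the card starts to look like a sensible amount.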