Still intent on improving single-threaded software performance, AMD has outlined some planned additions to the x86 instruction set that will appear in chips shipping in 2009.
The SSE5 extensions should make life easier on software developers and lead to rather dramatic performance gains. In particular, AMD expects the tweaks to boost code used in the high performance computing, multimedia and security arenas. Customers will see the extensions in AMD's new "Bulldozer" core-based chips that arrive in 2009, first for PC chips and then for server chips.
AMD and Intel have turned to adding more cores per chip to improve their products' performance rather than amping up GHz as in the past. This shift, caused by heat issues, means that developers need to write more complex multi-threaded software that can spread well across all of the cores. The software industry, however, is moving relatively slowly with these efforts, leaving tons of single-threaded code that could use a performance aid one way or another.
According to AMD, the new extensions will bring a couple of major breakthroughs.
For one, AMD will follow the RISC crowd with support for 3-Operand Instructions - up from two. So, unlike in the past where you would do A plus B and then have to store the result of the operation in A or B, developers can now store the result in a third location. This should reduce the total number of instructions needed to perform certain tasks and require less effort on the part of developers to keep track of registers.
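To make the register bookkeeping concrete, here is a minimal sketch of the difference, written as a toy Python model rather than real x86 code. The register names (`r0`, `r1`, `r2`) and every helper function are hypothetical and do not come from AMD's specification; the instruction counts describe this toy model only.

```python
# Toy register-file model. All names here are illustrative assumptions,
# not AMD's instruction mnemonics or encodings.

def mov(regs, dst, src):
    """Copy the value of register src into register dst."""
    regs[dst] = regs[src]

def add_two_operand(regs, dst, src):
    """dst = dst + src. The destination doubles as a source, so its old value is lost."""
    regs[dst] = regs[dst] + regs[src]

def add_three_operand(regs, dst, src1, src2):
    """dst = src1 + src2. Both source registers keep their values."""
    regs[dst] = regs[src1] + regs[src2]

def sum_keeping_inputs_two_operand(a, b):
    """Compute a + b while preserving both inputs, using two-operand adds."""
    regs = {"r0": a, "r1": b, "r2": 0}
    mov(regs, "r2", "r0")              # extra copy so r0 is not overwritten
    add_two_operand(regs, "r2", "r1")  # r2 = r2 + r1
    return regs, 2                     # two instructions issued

def sum_keeping_inputs_three_operand(a, b):
    """Compute a + b while preserving both inputs, using a three-operand add."""
    regs = {"r0": a, "r1": b, "r2": 0}
    add_three_operand(regs, "r2", "r0", "r1")  # r2 = r0 + r1
    return regs, 1                             # one instruction issued
```

Both routines leave the same values in the registers, but the two-operand version needs an extra copy whenever both inputs must survive the addition.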
The support for 3-Operand Instructions allows AMD to roll out a "fused multiply accumulate" instruction as well. This melds multiplication and addition to permit "iterative calculations with one instruction."
"It is basically taking two consecutive operations that occur very often in sequence and just making them a single operation in the instruction set," said Michael Frank, an AMD fellow.
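A dot product is a typical case of the multiply-then-add pattern Frank describes. The sketch below is an illustration in Python, not AMD's instruction: `multiply_accumulate` is a hypothetical stand-in for a single fused operation, and the operation counts are a simple model rather than measured performance. It also does not model how real hardware rounds the intermediate product.

```python
def multiply_accumulate(acc, a, b):
    """Hypothetical stand-in for one fused instruction: returns acc + a * b."""
    return acc + a * b

def dot_separate(xs, ys):
    """Dot product with a separate multiply and add on every iteration."""
    acc, ops = 0, 0
    for x, y in zip(xs, ys):
        product = x * y      # operation 1: multiply
        acc = acc + product  # operation 2: add
        ops += 2
    return acc, ops

def dot_fused(xs, ys):
    """Dot product where each iteration is modelled as one fused operation."""
    acc, ops = 0, 0
    for x, y in zip(xs, ys):
        acc = multiply_accumulate(acc, x, y)  # multiply and add in one step
        ops += 1
    return acc, ops
```

For two three-element vectors, the separate version counts six operations and the fused version three, with the same result.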
With the extensions, AMD has seen up to a 5x performance gain in AES (Advanced Encryption Standard) encryption and a 30 per cent boost for DCT (discrete cosine transform), a mathematical operation used with audio and video codecs.
AMD has released the specification for the new instructions.