
Intel adds fresh x86 and vector instructions for future chips

Some big changes are afoot

Intel has revealed two sets of extensions coming to the x86 instruction set architecture: one to boost the performance of general-purpose code, and a second to provide a common vector instruction set for future chips.

Some of the details were revealed on Intel’s developer website, showing the Advanced Performance Extensions (Intel APX) broadening the x86 instruction set with access to more registers and other features aimed at improving general-purpose performance. Advanced Vector Extensions 10 (Intel AVX10), meanwhile, is described as a “modern vector instruction set architecture” to be supported across future Intel processors.

APX represents what Intel is pitching as a big move for the future of its architecture. Its chief feature is a doubling of the number of general-purpose registers from 16 to 32. Having more registers means there is less need to shuffle values between registers and memory, and this is one way that Intel claims it will increase performance.

Specifically, it will allow the compiler to keep more values in registers, such that code taking advantage of APX may require 10 percent fewer loads from memory and potentially more than 20 percent fewer stores than the same code compiled for the existing instruction set, Intel claims.

This means that the CPU spends more time doing calculations instead of moving data around, while register accesses are also faster and consume less power than complex load and store operations.
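To make the register argument concrete, here is a toy model, not a real compiler register allocator. It replays a hypothetical trace of variable accesses through a register file managed with a least-recently-used policy and counts how often a value has to be loaded from or spilled to memory. The workload and the eviction policy are assumptions chosen for illustration, and the resulting numbers bear no relation to Intel's 10 and 20 percent figures.

```python
from collections import OrderedDict


def count_memory_traffic(trace, num_registers):
    """Replay a trace of variable names through an LRU-managed register file.

    Returns (loads, stores): a load is counted whenever a variable is not
    currently held in a register, and a store whenever a resident variable
    has to be evicted (spilled) to make room.
    """
    registers = OrderedDict()  # variable name -> None, ordered oldest first
    loads = stores = 0
    for name in trace:
        if name in registers:
            registers.move_to_end(name)  # mark as most recently used
            continue
        loads += 1  # value must be fetched from memory
        if len(registers) >= num_registers:
            registers.popitem(last=False)  # evict the least recently used
            stores += 1  # assume the evicted value is written back
        registers[name] = None
    return loads, stores


# Hypothetical workload: a loop body touching 24 distinct values, run 3 times.
trace = [f"v{i}" for i in range(24)] * 3

loads_16, stores_16 = count_memory_traffic(trace, 16)
loads_32, stores_32 = count_memory_traffic(trace, 32)
print(f"16 registers: {loads_16} loads, {stores_16} stores")
print(f"32 registers: {loads_32} loads, {stores_32} stores")
```

In this contrived case the working set of 24 values overflows 16 registers but fits in 32, so the larger register file only pays for the initial loads. Real code will rarely show such a clean split.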

The new general-purpose registers are XSAVE-enabled, meaning they can be automatically saved and restored by XSAVE/XRSTOR sequences during context switches, Intel says. No additional XSAVE area is required for this, as the registers reuse the space previously allocated to the registers of the now-deprecated Intel MPX extensions.
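A quick back-of-the-envelope check shows why the reuse is plausible. The sketch below is illustrative only: the MPX component sizes and the CPUID and XCR0 bit positions are assumptions drawn from our reading of Intel's published specifications, not figures given in Intel's announcement, and the function names are hypothetical.

```python
GPR_BYTES = 8            # each general-purpose register is 64 bits wide
NEW_APX_REGISTERS = 16   # 32 registers with APX minus the existing 16

# Assumed sizes of the two legacy MPX XSAVE components (not from the article).
MPX_BNDREGS_BYTES = 64   # four 128-bit bounds registers
MPX_BNDCSR_BYTES = 64    # bounds configuration and status component

# Assumed bit positions (not from the article).
CPUID_7_1_EDX_APX_F = 1 << 21  # CPUID.(EAX=7,ECX=1):EDX feature bit
XCR0_APX = 1 << 19             # OS has enabled APX state in XCR0


def apx_state_bytes():
    """Bytes needed to save the registers that APX adds."""
    return NEW_APX_REGISTERS * GPR_BYTES


def fits_in_mpx_area():
    """True if the new register state fits in the assumed MPX save area."""
    return apx_state_bytes() <= MPX_BNDREGS_BYTES + MPX_BNDCSR_BYTES


def apx_usable(cpuid_7_1_edx, xcr0):
    """True if the CPU reports APX and the OS saves and restores its state."""
    return bool(cpuid_7_1_edx & CPUID_7_1_EDX_APX_F) and bool(xcr0 & XCR0_APX)


print(apx_state_bytes(), fits_in_mpx_area())
```

The functions take raw register values as arguments rather than executing CPUID themselves, so the logic can be exercised on any machine.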

APX also adds conditional forms of the load, store, and compare/test instructions, intended to combat the performance hit applications can take from conditional branch mispredictions. These are implemented via EVEX prefix extensions of existing legacy instructions.
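The idea behind conditional instructions can be sketched in a few lines. The snippet below only illustrates the semantics: Python does not compile to these instructions, and both function names are hypothetical. The first version relies on a branch that the CPU would have to predict, while the second evaluates the condition once and then selects the result with arithmetic, so there is no jump to mispredict.

```python
def branchy_max_update(current, candidate):
    # Branch form: hardware has to guess which way the `if` will go.
    if candidate > current:
        current = candidate
    return current


def predicated_max_update(current, candidate):
    # Predicated form: compare once, then select the result without a jump.
    take = int(candidate > current)  # 1 if the update applies, else 0
    return take * candidate + (1 - take) * current


values = [3, 9, 4, 9, 12, 1]
running_branchy = running_predicated = values[0]
for v in values[1:]:
    running_branchy = branchy_max_update(running_branchy, v)
    running_predicated = predicated_max_update(running_predicated, v)
print(running_branchy, running_predicated)
```

Both forms compute the same answer. The trade-off on real hardware is that the predicated form always does the work of the update, which pays off when the branch is hard to predict.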

According to Intel, developers can take advantage of APX by recompiling code, and source code changes are not expected to be required.

We asked Intel when its processor chips would implement the new APX instructions, and will update this article if we get a response.

AVX10, according to Intel, is the first major vector instruction set update since the introduction of AVX-512. It is intended to provide a common, converged vector instruction set across all Intel architectures, and thus will be supported on all future processors, including performance cores (P-cores) and energy-efficient cores (E-cores).

AVX10 is based on the Intel AVX-512 feature set and will support all instruction vector lengths (128, 256, and 512 bits), as well as scalar and opmask instructions.

However, it appears that the “converged” version of AVX10 that will be common across all Intel processors will have a maximum vector length of 256 bits and 32-bit opmask registers. This is referred to as Intel AVX10/256.

Support for 512-bit vector and 64-bit opmask registers will continue to be offered on some P-core processors “for heavy vector compute applications that can leverage the additional vector length.” This is referred to as Intel AVX10/512.

While this might sound a little confusing, it appears that Intel wants to simplify developer support for vector instructions by guaranteeing a baseline level of support across all chips for code that benefits from vectorization, such as AI processing.

To this end, AVX10 will also introduce version-based instruction set enumeration, which is a fancy way of saying that all Intel chips with a given AVX10 version number will support the same features and instructions.

Developer code will only need to check three fields, according to Intel: A CPUID feature bit indicating that AVX10 is supported, the AVX10 version number, and a bit indicating the maximum supported vector length.
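The three-field check might look something like the following sketch. The specific CPUID leaves and bit positions are assumptions based on our reading of Intel's AVX10 specification and are not stated in the announcement; `Avx10Info`, `decode_avx10`, and `pick_kernel` are hypothetical names. The functions take raw register values as input so the decoding logic can be tried on any machine.

```python
from typing import NamedTuple, Optional

# Assumed bit positions, not given in the article.
CPUID_7_1_EDX_AVX10 = 1 << 19   # CPUID.(EAX=7,ECX=1):EDX support bit
LEAF24_EBX_VERSION_MASK = 0xFF  # CPUID.(EAX=0x24,ECX=0):EBX[7:0] version
LEAF24_EBX_VL256 = 1 << 17      # 256-bit vectors supported
LEAF24_EBX_VL512 = 1 << 18      # 512-bit vectors supported


class Avx10Info(NamedTuple):
    version: int
    max_vector_bits: int


def decode_avx10(cpuid_7_1_edx: int, cpuid_24_0_ebx: int) -> Optional[Avx10Info]:
    """Decode the three fields: support bit, version, maximum vector length."""
    if not cpuid_7_1_edx & CPUID_7_1_EDX_AVX10:
        return None  # AVX10 is not reported at all
    version = cpuid_24_0_ebx & LEAF24_EBX_VERSION_MASK
    if cpuid_24_0_ebx & LEAF24_EBX_VL512:
        max_bits = 512
    elif cpuid_24_0_ebx & LEAF24_EBX_VL256:
        max_bits = 256
    else:
        max_bits = 128
    return Avx10Info(version, max_bits)


def pick_kernel(info: Optional[Avx10Info]) -> str:
    """Choose a code path label from the decoded information."""
    if info is None:
        return "fallback"
    return f"avx10.{info.version}/{info.max_vector_bits}"


# Hypothetical register values for a chip reporting version 1 at 256 bits.
print(pick_kernel(decode_avx10(1 << 19, 1 | LEAF24_EBX_VL256)))
```

The appeal of this scheme is that a dispatcher only branches on a version number and a vector length, instead of testing a long list of individual feature flags as AVX-512 code has had to do.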

According to Intel, the Granite Rapids server chips due next year will serve as a transition point from AVX-512 to AVX10. These will feature AVX10 Version 1, which will not include the new 256-bit vector instructions.

AVX10 Version 2 will include the 256-bit instruction forms plus extra instructions covering new AI data types and conversions, data movement optimizations, and standards support. ®
