And where it all goes wrong
Back in around 1993, Intel introduced its Advanced Programmable Interrupt Controller (APIC), which, as its name suggests, manages interrupts coming into a processor. Interrupts are electrical pulses generated by the hardware telling the CPU to stop what it's doing, and sort out this urgent thing instead. It might be a countdown timer that's hit zero, or a hard drive finishing a data transfer command, etc. Most of the time, a driver is ultimately summoned to deal with the interrupt before the CPU continues with what it was doing.
The APIC is a split design: there's a local APIC for each processor core on the motherboard, and usually one IO APIC. The IO APIC hooks up with the hardware, and routes interrupt signals to the local APICs, which decide whether or not to interrupt their core and also pass messages between cores.
These local APICs were introduced as discrete chips (the 82489DX). Starting with the Pentium P54C in 1994, the local APICs were built into the actual processor. The local APICs' control registers – internal variables that configure how they work – were mapped into the processor's physical memory map starting from 0xFEE00000 and 0xFEE01000. That means operating system software could find and talk to a core's local APIC at those memory addresses.
When the Pentium Pro (a P6 family chip) arrived in 1995, Intel allowed kernel-level developers to reprogram the local APIC so that it would appear elsewhere in physical memory. This was handy for moving the local APIC out of the way of low-level software that expected to use that high 0xFEE00000 address for something else.
By using the
wrmsr instruction, the operating system developer can configure a processor core to move its local APIC to anywhere in memory. Just write the new physical memory address to the processor's model specific register 0x001b.
Anywhere in memory, you say?
Yes, anywhere in physical memory. Like say, where the SMM code stores its hidden private data.
By mapping the local APIC to 0x1FF80000, or thereabouts, it will overlap the SMM's private chunk of RAM. Now when the SMM is triggered into action by a special interrupt called an SMI, the CPU will stop what it's doing, switch to the SMM in ring -2, and execute its code. While running, the SMM's interrupt handler will want to use its private data in memory – except, it won't be accessing its hidden chunk of RAM, it'll be reading from the local APIC's internal registers instead. And we can control those registers, and use them to feed specially crafted data into the SMM to hijack it.
The SMM code has no idea this is happening; it is oblivious to the deception taking place. Here's how it'll work, step by step, taken from Domas' slides [PDF]: