Kernel-memory-leaking Intel processor design flaw forces Linux, Windows redesign
Speed hits loom, other OSes need fixes
Final update A fundamental design flaw in Intel's processor chips has forced a significant redesign of the Linux and Windows kernels to defang the chip-level security bug.
Programmers are scrambling to overhaul the open-source Linux kernel's virtual memory system. Meanwhile, Microsoft is expected to publicly introduce the necessary changes to its Windows operating system in an upcoming Patch Tuesday: these changes were seeded to beta testers running fast-ring Windows Insider builds in November and December.
Crucially, these updates to both Linux and Windows will incur a performance hit on Intel products. The effects are still being benchmarked, however we're looking at a ballpark figure of five to 30 per cent slow down, depending on the task and the processor model. More recent Intel chips have features – such as PCID – to reduce the performance hit. Your mileage may vary.
PostgreSQL SELECT 1 with the KPTI workaround for Intel CPU vulnerability https://t.co/N9gSvML2Fo— The Register (@TheRegister) January 2, 2018
Best case: 17% slowdown
Worst case: 23%
Similar operating systems, such as Apple's 64-bit macOS, will also need to be updated – the flaw is in the Intel x86-64 hardware, and it appears a microcode update can't address it. It has to be fixed in software at the OS level, or go buy a new processor without the design blunder.
Details of the vulnerability within Intel's silicon are under wraps: an embargo on the specifics is due to lift early this month, perhaps in time for Microsoft's Patch Tuesday next week. Indeed, patches for the Linux kernel are available for all to see but comments in the source code have been redacted to obfuscate the issue.
However, some details of the flaw have surfaced, and so this is what we know.
The fix is to separate the kernel's memory completely from user processes using what's called Kernel Page Table Isolation, or KPTI. At one point, Forcefully Unmap Complete Kernel With Interrupt Trampolines, aka FUCKWIT, was mulled by the Linux kernel team, giving you an idea of how annoying this has been for the developers.
Whenever a running program needs to do anything useful – such as write to a file or open a network connection – it has to temporarily hand control of the processor to the kernel to carry out the job. To make the transition from user mode to kernel mode and back to user mode as fast and efficient as possible, the kernel is present in all processes' virtual memory address spaces, although it is invisible to these programs. When the kernel is needed, the program makes a system call, the processor switches to kernel mode and enters the kernel. When it is done, the CPU is told to switch back to user mode, and reenter the process. While in user mode, the kernel's code and data remains out of sight but present in the process's page tables.
Think of the kernel as God sitting on a cloud, looking down on Earth. It's there, and no normal being can see it, yet they can pray to it.
These KPTI patches move the kernel into a completely separate address space, so it's not just invisible to a running process, it's not even there at all. Really, this shouldn't be needed, but clearly there is a flaw in Intel's silicon that allows kernel access protections to be bypassed in some way.
The downside to this separation is that it is relatively expensive, time wise, to keep switching between two separate address spaces for every system call and for every interrupt from the hardware. These context switches do not happen instantly, and they force the processor to dump cached data and reload information from memory. This increases the kernel's overhead, and slows down the computer.
Your Intel-powered machine will run slower as a result.
How can this security hole be abused?
At best, the vulnerability could be leveraged by malware and hackers to more easily exploit other security bugs.
Specifically, in terms of the best-case scenario, it is possible the bug could be abused to defeat KASLR: kernel address space layout randomization. This is a defense mechanism used by various operating systems to place components of the kernel in randomized locations in virtual memory. This mechanism can thwart attempts to abuse other bugs within the kernel: typically, exploit code – particularly return-oriented programming exploits – relies on reusing computer instructions in known locations in memory.
If you randomize the placing of the kernel's code in memory, exploits can't find the internal gadgets they need to fully compromise a system. The processor flaw could be potentially exploited to figure out where in memory the kernel has positioned its data and code, hence the flurry of software patching.
However, it may be that the vulnerability in Intel's chips is worse than the above mitigation bypass. In an email to the Linux kernel mailing list over Christmas, AMD said it is not affected. The wording of that message, though, rather gives the game away as to what the underlying cockup is:
AMD processors are not subject to the types of attacks that the kernel page table isolation feature protects against. The AMD microarchitecture does not allow memory references, including speculative references, that access higher privileged data when running in a lesser privileged mode when that access would result in a page fault.
A key word here is "speculative." Modern processors, like Intel's, perform speculative execution. In order to keep their internal pipelines primed with instructions to obey, the CPU cores try their best to guess what code is going to be run next, fetch it, and execute it.
It appears, from what AMD software engineer Tom Lendacky was suggesting above, that Intel's CPUs speculatively execute code potentially without performing security checks. It seems it may be possible to craft software in such a way that the processor starts executing an instruction that would normally be blocked – such as reading kernel memory from user mode – and completes that instruction before the privilege level check occurs.
That would allow ring-3-level user code to read ring-0-level kernel data. And that is not good.
The specifics of the vulnerability have yet to be confirmed, but consider this: the changes to Linux and Windows are significant and are being pushed out at high speed. That suggests it's more serious than a KASLR bypass.
Also, the updates to separate kernel and user address spaces on Linux are based on a set of fixes dubbed the KAISER patches, which were created by eggheads at Graz University of Technology in Austria. These boffins discovered [PDF] it was possible to defeat KASLR by extracting memory layout information from the kernel in a side-channel attack on the CPU's virtual memory system. The team proposed splitting kernel and user spaces to prevent this information leak, and their research sparked this round of patching.
Their work was reviewed by Anders Fogh, who wrote this interesting blog post in July. That article described his attempts to read kernel memory from user mode by abusing speculative execution. Although Fogh was unable to come up with any working proof-of-concept code, he noted:
My results demonstrate that speculative execution does indeed continue despite violations of the isolation between kernel mode and user mode.
It appears the KAISER work is related to Fogh's research, and as well as developing a practical means to break KASLR by abusing virtual memory layouts, the team may have somehow proved Fogh right – that speculative execution on Intel x86 chips can be exploited to access kernel memory.
The bug will impact big-name cloud computing environments including Amazon EC2, Microsoft Azure, and Google Compute Engine, said a software developer blogging as Python Sweetness in this heavily shared and tweeted article on Monday:
There is presently an embargoed security bug impacting apparently all contemporary [Intel] CPU architectures that implement virtual memory, requiring hardware changes to fully resolve. Urgent development of a software mitigation is being done in the open and recently landed in the Linux kernel, and a similar mitigation began appearing in NT kernels in November. In the worst case the software fix causes huge slowdowns in typical workloads.
There are hints the attack impacts common virtualisation environments including Amazon EC2 and Google Compute Engine...
Microsoft's Azure cloud – which runs a lot of Linux as well as Windows – will undergo maintenance and reboots on January 10, presumably to roll out the above fixes.
Amazon Web Services also warned customers via email to expect a major security update to land on Friday this week, without going into details.
There were rumors of a severe hypervisor bug – possibly in Xen – doing the rounds at the end of 2017. It may be that this hardware flaw is that rumored bug: that hypervisors can be attacked via this kernel memory access cockup, and thus need to be patched, forcing a mass restart of guest virtual machines.
A spokesperson for Intel was not available for comment. ®
Updated to add
The Intel processor flaw is real. A PhD student at the systems and network security group at Vrije Universiteit Amsterdam has developed a proof-of-concept program that exploits the Chipzilla flaw to read kernel memory from user mode:
The Register has also seen proof-of-concept exploit code that leaks a tiny amount of kernel memory to user processes.
Finally, macOS has been patched to counter the chip design blunder since version 10.13.3, according to operating system kernel expert Alex Ionescu. And it appears 64-bit ARM Linux kernels will also get a set of KAISER patches, completely splitting the kernel and user spaces, to block attempts to defeat KASLR. We'll be following up this week.
Check out our summary of the processor bug, here, now that full details are known. Bear in mind there are two flaws at play here: one called Meltdown that mostly affects Intel, and what the above article is all about, and another one called Spectre that affects Intel, AMD, and Arm cores.
See our analysis of Intel's response here.
Additional reporting by John Leyden