FreeBSD can now boot in 25 milliseconds
On AWS Firecracker – but there are other new micro-VM engines around, too
Replacing a sort algorithm in the FreeBSD kernel has made part of its boot process around 100 times faster… and although the work is aimed at a micro-VM, the gains should benefit everyone.
MicroVMs have been a hot area of technology R&D for the last half decade or so. The core idea is a reinvention of some of the concepts and technology that IBM invented along with the hypervisor in the 1960s: designing OSes specifically to run as guests under another OS. This means building the OS specifically to run inside a VM, and to talk to resources provided by a specific hypervisor rather than to fake hardware.
This means that the guest OS needs next to no support for real hardware, just VirtIO drivers which talk directly to facilities provided by the host hypervisor. In turn, the hypervisor doesn't have to provide an emulated PCI bus, emulated power management, emulated graphics card, emulated network interface cards, and so on. The result is that the hypervisor itself can be much smaller and simpler.
Ruthlessly chopping down both the hypervisor and the OS that runs inside it pays off at both ends: VMs can use far fewer resources, and start up much more quickly.
At the moment, the commercial goal of this is providing "serverless" compute power. "Serverless" computing is marketing double-speak, really: of course there really are servers, somewhere in a datacenter. But rather than providing Infrastructure as a Service, the famed IaaS model, this is Function as a Service (FaaS) instead. The idea is that you don't need to know anything about the infrastructure: your program calls another program, and the management tooling spawns as many instances as needed to run that specific operation, returns the result, and then deletes the VMs used to run the calculations. You never need to know where it happened or how.
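As a rough illustration of that spawn, run, return, delete lifecycle, here is a toy in-process sketch in Python. Nothing in it talks to a real hypervisor: `micro_vm`, `invoke`, and `live_vms` are hypothetical names invented for this example, and a "VM" is just an ID in a set.

```python
from contextlib import contextmanager

# Hypothetical bookkeeping: IDs of the pretend micro-VMs currently alive.
live_vms = set()
_next_id = 0


@contextmanager
def micro_vm():
    """Pretend to spawn a micro-VM, and always delete it afterwards."""
    global _next_id
    _next_id += 1
    vm_id = _next_id
    live_vms.add(vm_id)          # "spawn" the instance
    try:
        yield vm_id
    finally:
        live_vms.discard(vm_id)  # "delete" it, even if the function failed


def invoke(function, payload):
    """Run one function call inside its own short-lived pretend VM."""
    with micro_vm():
        return function(payload)


# The caller only sees results, never the instances that produced them.
results = [invoke(lambda n: n * n, n) for n in range(5)]
```

Because each call lives for only as long as its VM does, how quickly the guest OS boots adds directly to how long every call takes.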
For the customer, it's good because it's fast and it's easy. For the providers, it's good because it means the resources are freed up again much more quickly, so they can be reused immediately, which means supporting more customers on the same amount of hardware.
AWS offers FaaS via a service called Lambda, named after an arcane bit of functional programming terminology. Lambda is powered by Amazon's home-grown Firecracker hypervisor, which also powers its Fargate serverless offering.
Firecracker is based on the Linux kernel's built-in KVM hypervisor: in itself, something of a departure, as up until then, AWS was based on the Xen hypervisor. This means it's inherently a Linux-on-Linux offering. That sounded like a challenge to FreeBSD kernel developer Colin Percival, as we reported a year ago: he decided to get FreeBSD running on Firecracker. As with most of computing in general, though, the overall optimization process is: first, get it working at all; then, make it go fast.
According to his tweet earlier this week, his latest performance optimization is impressive: replacing a sort algorithm made part of the FreeBSD kernel startup process around a hundred times faster, bringing the kernel loading time down to 25 milliseconds. That's a quarter of one-tenth of a second.
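To see why swapping a sort can matter so much, here is a minimal Python sketch. It rests on assumptions: this article doesn't name the algorithms involved, so the example simply pits a quadratic bubble sort against an O(n log n) merge sort, and the `entries` list of (subsystem, order) pairs is hypothetical stand-in data, not the kernel's real initialization list or code.

```python
import random


def bubble_sort(items):
    """Quadratic sort. Returns (sorted list, number of comparisons)."""
    a = list(items)
    comparisons = 0
    n = len(a)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):
            comparisons += 1
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
                swapped = True
        if not swapped:  # already sorted, stop early
            break
    return a, comparisons


def merge_sort(items):
    """O(n log n) sort. Returns (sorted list, number of comparisons)."""
    a = list(items)
    if len(a) <= 1:
        return a, 0
    mid = len(a) // 2
    left, left_cmp = merge_sort(a[:mid])
    right, right_cmp = merge_sort(a[mid:])
    merged = []
    comparisons = left_cmp + right_cmp
    i = j = 0
    while i < len(left) and j < len(right):
        comparisons += 1
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, comparisons


# Hypothetical data: 1,000 (subsystem, order) pairs in random order.
random.seed(0)
entries = [(random.randrange(100), random.randrange(100)) for _ in range(1000)]

slow, slow_cmp = bubble_sort(entries)
fast, fast_cmp = merge_sort(entries)
print(f"bubble sort: {slow_cmp} comparisons, merge sort: {fast_cmp} comparisons")
```

Both functions produce the same ordering, but the number of comparisons grows with the square of the list length in one case and only slightly faster than the length itself in the other. The gap widens as the list gets longer.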
Colin Percival (@cperciva), August 20, 2023
This tweak is just the latest in a long series, which he described in much more detail a couple of days later. His write-up covers the preliminary changes needed to get FreeBSD booting at all: removing several initialization steps which assumed it was booting under Xen, then querying ACPI for the type and number of processors. That failed, as Firecracker doesn't provide ACPI. Then, initialization of one of the few bits of hardware it does emulate, a serial console, failed.
Once the kernel was starting successfully, memory usage quickly became a problem, due to an assumption which had to be changed: Firecracker defaults to assigning the guest a mere 128MB of RAM. What follows is a whole laundry list of optimizations, each of which contributed a small time saving.
It's an interesting read, even if you're not super technical. Some of the steps change things that were quite reasonable choices for booting on dedicated hardware, which no longer make sense in a virtual environment where a machine is spawned, does some work, and is deleted again within a matter of a few seconds.
In Percival's words: "I believe Linux is at 75-80 ms for the same environment where I have FreeBSD booting in 25 ms."
"When I started working on speeding up the boot process, the kernel took about 10 seconds to boot, so I have a kernel booting about 400x faster now than I did a few years ago."
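The figures quoted above can be checked with a few lines of Python. The numbers are the ones given in this article; the variable and function names are just labels chosen for this sketch.

```python
def speedup(before_ms, after_ms):
    """How many times faster 'after' is than 'before'."""
    return before_ms / after_ms


freebsd_now_ms = 25              # current FreeBSD kernel boot on Firecracker
freebsd_before_ms = 10 * 1000    # "about 10 seconds" when the work started
linux_low_ms, linux_high_ms = 75, 80  # estimate for Linux, same environment

overall = speedup(freebsd_before_ms, freebsd_now_ms)
vs_linux_low = speedup(linux_low_ms, freebsd_now_ms)
vs_linux_high = speedup(linux_high_ms, freebsd_now_ms)

print(f"FreeBSD now vs a few years ago: {overall:.0f}x faster")
print(f"FreeBSD vs Linux estimate: {vs_linux_low:.1f}x to {vs_linux_high:.1f}x faster")
```

So 10 seconds down to 25 ms is indeed a factor of 400, and the Linux estimate works out to roughly three times the FreeBSD figure.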
For now, the optimized kernel is the FreeBSD 14 one, on x86-64, but work is underway to bring it to Arm64 as well — AWS is the biggest user of Arm servers in the world.
Firecracker is one of the higher-profile microVMs around, but there are others, and its success has inspired the QEMU developers to add a microvm virtual platform as well. Canonical developer Christian Ehrhardt has blogged about how to use this in Ubuntu, and online-code-development-environment vendor Hocus recently explained why it switched from Firecracker to the QEMU equivalent.
We can see a lot of potential uses for microVMs, not just in cloud scenarios. The ability to run a single program built for one OS on top of a totally different OS, without the overhead of running a full emulated environment all the time, could be very handy in all kinds of situations.
Containers are a very useful tool, but in containers you can only run binaries built for the same OS as the host. Running anything else – such as Docker Linux containers on macOS – means that some emulation and a guest OS have been hidden away somewhere in the stack. The smaller that VM can be, and the fewer resources it uses, the better the overall performance, not only of the containers but of the whole machine. ®