Google Native Client: The web of the future - or the past?

This time, it's Mozilla v Google


A sandbox twenty years in the making

Native Client is inspired by a research paper published by a team of academics at the University of California, Berkeley in the early 90s. The paper described a means of sandboxing native code known as "software-based fault isolation". In essence, this system used a special instruction sequence to ensure that machine code programs conformed to certain security properties, and then it could analyze the resulting code to determine whether those properties were indeed followed.

Brad Chen

Brad Chen

The trouble is that the technique wasn't feasible for use with the x86 instruction set used by modern Intel and Intel-compatible processors. But Google changed that, in part by tapping the instruction set's rarely-used "segment registers". Expanding on a second paper from MIT and Harvard academics Stephen McCamant and Greg Morrisett, Google produced a version of the system that not only worked with x86, but did so with low overhead, letting native code run fairly close to its original speed. "They took the idea and really made it real," Harvard professor Morrisett tells The Register. "They have a very good handle now on how to do this stuff."

With the 32-bit x86 instruction set, Native Client uses the segment registers to restrict where in memory a program can read and write data and to ensure that a program doesn't jump to code outside a certain range of memory. But it also includes a modified compiler and a code verifier that work to keep code jumps in line.

An ordinary program will read a data value from memory into a register and then jump to the address that value represents. But with Native Client, the compiler performs a bit of arithmetic on that value before the jump to ensure it doesn't target bad instructions, and then the code verifier double-checks the compiler's work.

Among other things, the verifier must ensure that a program doesn't leap past the extra instructions inserted by the compiler. "You've added this arithmetic to make sure the jumps are in range. But the real issue is that if it's really clever, a program will arrange for a jump to jump past that arithmetic," says Morrisett. "You might protect one jump but not the next one."

Greg Morrisett

Greg Morrisett

Originally, it was difficult to write such a verifier for x86 because the lengths of the instructions vary. If the lengths aren't the same, it takes far too long to find rogue instructions. As you parse the sequence of raw bytes looking for these bad boys, you have to assume that they could start at any given byte.

But McCamant, then a grad student at MIT, realized this problem could be eased by "padding" the instructions with dummy bytes. "On a RISC machine [where all instructions are the same length], there's only one parse. But on an x86 machine, there can be multiple parses, and it was infeasible to check all of them," says Morrisett. "The trick was to force the instructions to be aligned in a certain way." And Google ran with the idea. In addition to boosting the performance of the system by way of the x86 segment registers, Google improved on the speed of McCamant's padding method.

The games Google plays

Like any other piece of software, Native Client will contain its own security holes, and according to Morrisett, this is a particular worry with the code verifier. In 2009, Google ran a contest to identify bugs in Native Client, and yes, some were found. But Morrisett now describes the system as "pretty robust".

The larger point is that in providing software-based fault isolation on x86, Native Client still maintains the speed of native code – or thereabouts. According to Morrisett, the sandbox adds only about 5 per cent overhead, which gives Native Client a significant advantage over JavaScript.

"It gives you a tremendous improvement in performance compared to other options for running code in the browser," he says. "It makes it possible to do serious rendering and serious number-crunching in the browser without a big performance hit."

Specifically, Native Client can provide more in the way of parallel execution than JavaScript, and it allows for much faster vector arithmetic. For Google, it's a way of opening the browser to 3D games, video editing, and other applications that require that extra bit of oomph, and such applications can be moved to the platform with relatively little effort. Existing native applications must be ported for use on Native Client, but developers needn't start from scratch. They can use their existing codebase.

"[Native Client] gives you a tremendous improvement in performance compared to other options for running code in the browser. It makes it possible to do serious rendering and serious number-crunching in the browser."

– Greg Morrisett

This proposition fits quite nicely with Chrome OS, the fledgling Google operating system that puts all applications inside the browser. With Chrome OS, running existing 3D games and other desktop applications isn't really an option. But the Native Client project pre-dates Google's operating system effort, and the ultimate goal is to bring a new breed of applications to the entire web.

"JavaScript now does integers well, but JavaScript engines haven't really gotten around to vector arithmetic yet, and that's where native code currently tends to have a big advantage...These things aren't important to all software, but when they are, it can mean one or two orders of magnitude in performance," says Google's Linus Upson.

"The sweet spot is 3D games. They do a lot of computation, and the codebase already exists in C++, so they don't have to be rewritten. It allows us to get a lot more fun and exciting games on the web."

The rub is that Native Client isn't the web – at least not yet. It will soon be an integral part of Google's browser and its browser-based operating system. But that's a little different.


Other stories you might like

  • Google sours on legacy G Suite freeloaders, demands fee or flee

    Free incarnation of online app package, which became Workplace, is going away

    Google has served eviction notices to its legacy G Suite squatters: the free service will no longer be available in four months and existing users can either pay for a Google Workspace subscription or export their data and take their not particularly valuable businesses elsewhere.

    "If you have the G Suite legacy free edition, you need to upgrade to a paid Google Workspace subscription to keep your services," the company said in a recently revised support document. "The G Suite legacy free edition will no longer be available starting May 1, 2022."

    Continue reading
  • SpaceX Starlink sat streaks now present in nearly a fifth of all astronomical images snapped by Caltech telescope

    Annoying, maybe – but totally ruining this science, maybe not

    SpaceX’s Starlink satellites appear in about a fifth of all images snapped by the Zwicky Transient Facility (ZTF), a camera attached to the Samuel Oschin Telescope in California, which is used by astronomers to study supernovae, gamma ray bursts, asteroids, and suchlike.

    A study led by Przemek Mróz, a former postdoctoral scholar at the California Institute of Technology (Caltech) and now a researcher at the University of Warsaw in Poland, analysed the current and future effects of Starlink satellites on the ZTF. The telescope and camera are housed at the Palomar Observatory, which is operated by Caltech.

    The team of astronomers found 5,301 streaks leftover from the moving satellites in images taken by the instrument between November 2019 and September 2021, according to their paper on the subject, published in the Astrophysical Journal Letters this week.

    Continue reading
  • AI tool finds hundreds of genes related to human motor neuron disease

    Breakthrough could lead to development of drugs to target illness

    A machine-learning algorithm has helped scientists find 690 human genes associated with a higher risk of developing motor neuron disease, according to research published in Cell this week.

    Neuronal cells in the central nervous system and brain break down and die in people with motor neuron disease, like amyotrophic lateral sclerosis (ALS) more commonly known as Lou Gehrig's disease, named after the baseball player who developed it. They lose control over their bodies, and as the disease progresses patients become completely paralyzed. There is currently no verified cure for ALS.

    Motor neuron disease typically affects people in old age and its causes are unknown. Johnathan Cooper-Knock, a clinical lecturer at the University of Sheffield in England and leader of Project MinE, an ambitious effort to perform whole genome sequencing of ALS, believes that understanding how genes affect cellular function could help scientists develop new drugs to treat the disease.

    Continue reading

Biting the hand that feeds IT © 1998–2022