This article is more than 1 year old
Microsoft releases open source bug-bomb in the rambling house of C
Checked C lands on Github
The zombie bugs in programs and libraries at the heart of the Internet's infrastructure often have the C programming language in common.
Microsoft Research now wants to add the kind of bounds-checking seen in C# to C, to help splat bugs like “buffer overruns, out-of-bounds memory accesses, and incorrect type casts,” in an add-on called Checked C.
It's an attempt to resolve a paradox in C: on the one hand, handling pointers directly makes for efficient, “close to the hardware” programming; on the other, pointer errors provide lots of dangerous vectors.
Microsoft notes that between 2010 and 2015, buffer overflows were behind as many as 16 per cent of the bugs added to the US National Vulnerability Database each year. The specification document lists the Windows and Linux operating systems, IE, Chrome and Safari browsers, Apache, OpenSSL, Bash, Ruby and PHP scripting languages and QuickTime as all suffering from buffer overflows in that period.
Released under the MIT License, the Checked C adds dynamic checking to the venerable language. Microsoft Research has published the full 140-plus page specification for Checked C here. In it, the company's David Tarditi explains that the extension “adds new pointer types and array types that are bounds-checked, yet layout-compatible with existing pointer and array types”.
It applies dynamic checking to enforce the integrity of memory access at runtime – something you can't get from static checking. Importantly, the extension is fully backwards-compatible, so an existing C program will work without rewrite.
Key to it is better handling of pointers in C programs. Checked C “allows programmers to better describe how they intend to use pointers and the range of memory occupied by data that a pointer points to,” MS Research explains at its project page.
Checked C currently has the following checked pointer types: a simple pointer without pointer arithmetic; a pointer to an array of known size; and a generalisation of the array pointer to an arbitrary memory area of varying size; and a checked array type. ®