
Universal Unix tool AWK gets Unicode support

From original author Brian Kernighan, one of the original Unix team

In Unix terms, this news is akin to Moses appearing and announcing an amendment to the 10 commandments.

AWK, a programming language for analyzing text files, is a core part of the Unix operating system, including Linux, all the BSDs and others. For an OS to be considered POSIX compliant, it must include AWK. AWK first appeared in 1977 and was included in Version 7 UNIX in 1979 – the last version of UNIX from Bell Labs, before AT&T turned it into a commercial product.

What is notable about the tool gaining Unicode support is not so much the feature itself, but who wrote it: Canadian computer scientist Brian Kernighan.

AWK's name is an acronym for its three original developers: Turing Award winner Alfred Aho, Peter Weinberger and Brian Kernighan. Professor Kernighan is also the "K" in "K&R C", as in the original, classic, 1978 book The C Programming Language, written by Professor Kernighan and the late, great Dennis Ritchie.

Indeed, the book specified not only a version of the C language, now known as C78, but even an indentation style. Such is its influence that in old Unix hacker circles, the book is sometimes called "the old testament" and the indentation "the one true brace style".

There are other versions of AWK, but this is the original, known as the One True AWK. The code change is listed on GitHub under the unassuming description "Add BWK's email." The professor modestly notes:

Once I figure out how (and do some more checking), I will try to submit a pull request. I wish I understood git better, but in spite of your help, I still don't have a proper understanding, so this may take a while.

He has The Reg FOSS desk's sympathies. This vulture is a stripling of 54 and remains unable to wrap his head around the aptly named Git, whereas Prof Kernighan is 80.

Kernighan also came up with the name "UNIX" and invented the "Hello, world" demonstration program, which he originally wrote for the B programming language, a forerunner of C. Of C itself, though, he maintains:

I had no part in the birth of C, period. It's entirely Dennis Ritchie's work. I wrote a tutorial on how to use C for people at Bell Labs, and I twisted Dennis's arm into writing a book with me.

Prof Kernighan has written a number of other notable books, including in recent years The Go Programming Language (2015), Understanding the Digital World (2017), and Unix: A History and a Memoir (2019).

It's important to remember that software such as Unix is not holy writ, handed down inviolable from historical times. Most of the people who designed, implemented and shaped it are still with us. In this case, the code change was actually made a few months ago, but was only noticed by the wider world thanks to a newly released interview with Prof Kernighan. ®
