Torvalds intentionally complicates his use of indentation in Linux Kconfig
Paramount penguin forces more robust whitespace handling
Linux kernel supremo Linus Torvalds has made the use of indentation in kernel config files more ambiguous – intentionally to weed out inferior parsers.
Kernel 6.9-rc4, the latest release candidate for the next version of the Linux kernel, came out yesterday. Among the usual drivers and bug fixes, it contains some more tweaks for bcachefs, as well as some mitigations against the recently-uncovered Spectre-style Native Branch History Injection data leaks.
However, the change that brought the most amusement to the face of the Reg FOSS desk is a configuration file change from Linus himself, titled "Kconfig: add some hidden tabs on purpose." He switched a space indent to a tab indent to catch out poor-quality parsers.
Specifically, in this block of text in the kernel source...
    default 12 if PAGE_SIZE_4KB
    default 13 if PAGE_SIZE_8KB
    default 14 if PAGE_SIZE_16KB
    default 15 if PAGE_SIZE_32KB
    default 16 if PAGE_SIZE_64KB
    default 18 if PAGE_SIZE_256KB
...the character between default and the integer value is now a tab rather than a space. With tab stops every eight columns, it happens to fall just before a tab boundary, so it displays as a single space character.
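To see why the tab is invisible in most editors, here is a small Python sketch using the standard str.expandtabs method. It assumes eight-column tab stops and, as a guess at the file layout rather than anything quoted from the kernel source, a leading tab before default. The variable names are purely illustrative.

```python
# Illustrative only: the exact bytes of the kernel file are assumed, not quoted.
TAB_WIDTH = 8  # tab stops every eight columns

with_space = "\tdefault 12 if PAGE_SIZE_4KB"   # space after "default"
with_tab = "\tdefault\t12 if PAGE_SIZE_4KB"    # tab after "default"

# The leading tab fills columns 0-7 and "default" fills columns 8-14, so the
# second tab only has to advance one column to reach the next stop at 16.
rendered_space = with_space.expandtabs(TAB_WIDTH)
rendered_tab = with_tab.expandtabs(TAB_WIDTH)

print(rendered_space == rendered_tab)  # compares what appears on screen
print(with_space == with_tab)          # compares the underlying characters
```

The same arithmetic holds without the leading tab: "default" is seven characters long, so a tab that follows it only ever needs one column to reach the next stop.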
While Torvalds is famed for his robust approach to giving feedback in public, he has been working on it, and in 2018 took a break to help him to get the emotions in his emails under better control.
This change could be one example of that effort. The kernel commandant spotted one particular code change, commit d96c36004e31, which had a single purpose:
Fix FTRACE_RECORD_RECURSION_SIZE entry, replace tab with a space character. It helps Kconfig parsers to read file without error.
Kconfig is the configuration language used to control the kernel build system, and like many other off-side rule languages, it uses indentation to delimit blocks. Yes, significant whitespace, just like in Python, YAML, and many other programming and configuration languages. Love it or hate it, you can't escape it.
In this change, he is intentionally making the use of indentation in kernel build configuration files more complicated, in order to force the authors of tools which parse such files to improve their game. As he explains:
Let's make sure it gets fixed. Because if you can't parse tabs as whitespace, you should not be parsing the kernel Kconfig files.
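The kind of failure Torvalds describes can be sketched in Python. Both functions below are hypothetical and are not taken from any real Kconfig tool: one splits a line on literal space characters only, while the other splits on any run of whitespace.

```python
def tokenize_spaces_only(line: str) -> list[str]:
    """Hypothetical fragile tokenizer: treats only ' ' as a separator."""
    return [tok for tok in line.strip(" ").split(" ") if tok]


def tokenize_any_whitespace(line: str) -> list[str]:
    """Hypothetical robust tokenizer: spaces and tabs both separate tokens."""
    return line.split()


line = "default\t12 if PAGE_SIZE_4KB"  # tab between keyword and value

fragile = tokenize_spaces_only(line)
robust = tokenize_any_whitespace(line)

print(fragile)  # the keyword and value stay glued together by the tab
print(robust)   # four separate tokens
```

A tool built like the first function would see an unknown keyword where it expected default, which is plausibly the sort of error the earlier commit was working around.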
This seems to us to be an instance of Postel's Law, which Postel enshrined in RFC 761 in 1980: Be liberal in what you accept, and conservative in what you send. Many indentation-marked languages have a recommended style, such as Python's PEP 8, which says very clearly and distinctly:
Use 4 spaces per indentation level.
But in fact, although the style guide is strict, the interpreter will happily accept different numbers of spaces, or tabs in some blocks and spaces in others, so long as developers are consistent within each block. This is what Linus wants to see, and he quite rightly regards as broken whatever parsing tool it was that failed when it encountered tabs instead of spaces.
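The interpreter's tolerance is easy to check with the built-in compile function. The sketch below assumes Python 3. The sample sources and the helper name compiles are made up for illustration.

```python
def compiles(source: str) -> bool:
    """Return True if Python accepts the source, False on a syntax error."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:  # IndentationError and TabError are subclasses
        return False


# Two spaces in one function, a tab in the other: each block is consistent.
different_styles = "def a():\n  return 1\n\ndef b():\n\treturn 2\n"

# A tab on one line and eight spaces on the next, inside the same block.
mixed_in_one_block = "if True:\n\tx = 1\n        y = 2\n"

print(compiles(different_styles))
print(compiles(mixed_in_one_block))
```

The first source is accepted and the second is rejected with a TabError, so the tolerance covers differences in style between blocks but not ambiguity inside one.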
So, rather than a savagely critical email response, he is knowingly and with malice aforethought using more complicated indentation in order to expose tools that can't handle it. It should weed out the weaker tools, leaving only the fitter, better-adapted ones… while not publicly hurting anyone's feelings.
Think of it as evolution in action. ®