Here's a thought experiment proposed by the Linux Foundation today: If you had to start from scratch, what would it cost to create a Linux distribution?
The short answer is: about $1.4bn for the Linux kernel and about $10.8bn for the Red Hat Fedora 9 development release, the latest one out this summer.
Because Linux is an open source and sometimes volunteer effort, with contributions from both individuals and corporations, it is difficult to estimate what it costs to put together a Linux distribution. Back in June 2001, David Wheeler, a computer scientist in the open source world (but not the David Wheeler who got the world's first PhD in computer science in 1951 and went on to invent the programming subroutine), posted a study calculating the cost of developing a Linux distribution by counting the lines of code in the distro, seeing how many person-years that code would take to develop, and plunking down salaries, benefits, and overhead to cover those costs, just as you would have to do to create a proprietary operating system from scratch.
The methodology for estimating the programming effort and cost is more complex than the elements above, and Wheeler used slightly different models to come up with costs. The idea - if you can bear with it without getting overly critical - was to attempt to put some kind of economic value on the hard work that open source programmers do.
Back in 2001, Wheeler estimated that it would cost $600m to create Red Hat 6.2, released in 2000 - it had 17 million lines of code and would take about 4,500 person-years to create. When he updated these calculations six years ago, his objective was to peg the development cost of the Red Hat 7.1 distribution, which had over 30 million lines of code and was estimated to require about 8,000 person-years of effort in total.
Using prevailing salaries and overhead for US-based programmers, Wheeler calculated that Red Hat 7.1 had a development cost of over $1bn - again, that is if you started from scratch and if you got all the code right the first time, which no software development project does.
The Linux Foundation, which is charged with championing the cause of Linux and open source software, decided it was time to run the latest Red Hat development release, Fedora 9, through the Wheeler methodology and see how the cost of a Linux distro has changed in the past six years. The resulting study is titled Estimating the Total Development Cost of a Linux Distribution.
Using the same method of counting software lines of code and then using some equations to reckon the effort involved in creating the code, the Linux Foundation came to the conclusion that in 2008 dollars, Fedora 9 would cost $10.8bn to develop. Fedora 9 has 204.5 million lines of code, and the methodology used by Wheeler estimates that this code represents 59,389 person-years of effort. Taking a prevailing wage for programmers in the US of $75,662, and multiplying it by an overhead factor of 2.4 to cover benefits, office overhead, and other charges, yields that $10.8bn.
Interestingly, if you swap in the prevailing wage in 2000 dollars that was used in the Red Hat 6.2 and 7.1 calculations ($56,286), you come up with a figure of just over $8bn in 2000 dollars for the cost of developing Fedora 9. That extra $7bn over the roughly $1bn estimate for Red Hat 7.1 is one way to reckon the incremental value of all of the enhancements that have made it into the Fedora Linux distribution in the past seven years.
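For anyone who wants to check the sums, the arithmetic is simple enough to reproduce. The sketch below is not taken from the study: the function name `development_cost` is made up for illustration, and it assumes the 2.4 overhead figure is a straight multiplier on salary, which is the reading that reproduces the published totals. The person-year and salary figures are the ones quoted above.

```python
def development_cost(person_years, salary, overhead_factor=2.4):
    """Effort x annual salary x overhead multiplier, in dollars.

    Assumes overhead is applied as a plain multiplier on salary.
    """
    return person_years * salary * overhead_factor


FEDORA9_PERSON_YEARS = 59_389  # effort estimate quoted for Fedora 9

# Same effort, priced at the 2008 and the 2000 prevailing wage.
cost_2008 = development_cost(FEDORA9_PERSON_YEARS, 75_662)
cost_2000 = development_cost(FEDORA9_PERSON_YEARS, 56_286)

print(f"Fedora 9 in 2008 dollars: ${cost_2008 / 1e9:.1f}bn")
print(f"Fedora 9 in 2000 dollars: ${cost_2000 / 1e9:.1f}bn")
```

The effort estimate stays fixed in both cases; only the wage changes, which is why the gap between the two totals reflects wage growth rather than any change in the code.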
The Linux Foundation also calculated the cost of creating the Linux kernel, and at Wheeler's suggestion from back in 2002, used a slightly different cost method for this. The Linux 2.6.25 kernel used in Fedora has 6.8 million lines of code, and using this different methodology - called the intermediate COCOMO model - the foundation reckoned it would take 7,557 person-years to make the kernel at a cost of just under $1.4bn.
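The kernel figure can be checked the same way. This is a rough sketch rather than the study's own calculation: it takes the 7,557 person-years as given (the intermediate COCOMO step that produces it is not reproduced here), assumes the same $75,662 wage and 2.4 multiplier as the Fedora 9 estimate, and derives a cost per line that the article does not state. Since the 6.8 million line count is rounded, that per-line figure is only approximate.

```python
OVERHEAD_FACTOR = 2.4   # assumed to be a plain multiplier on salary
SALARY_2008 = 75_662    # prevailing US programmer wage quoted for 2008

kernel_person_years = 7_557   # effort estimate quoted for Linux 2.6.25
kernel_sloc = 6_800_000       # rounded line count quoted for the kernel

kernel_cost = kernel_person_years * SALARY_2008 * OVERHEAD_FACTOR
cost_per_line = kernel_cost / kernel_sloc  # derived here, not from the study

print(f"Kernel cost: ${kernel_cost / 1e9:.2f}bn")
print(f"Implied cost per line: ${cost_per_line:.0f}")
```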
As the Linux Foundation report freely admits, there are plenty of limits to the analysis that it has put together. First and foremost, the methodologies only really take into account net additions to software in the Linux distribution, and do not take into account the economic cost of lots of code that is created and then not used or eventually removed from the stack. The methodology used for the calculations was also based on research into how proprietary software is created inside a company, not on open source, collaborative development efforts.
Then there is the fact that Linux has many distributions with many different packages and repositories; there is no such thing as "Linux", technically, unless you are talking about a kernel. (But people and corporations talk about Linux as an operating system because that is how, as end users, they see it in their lives.) The study also doesn't take a look at the bloat inside of a Linux distro - such as old drivers - that no one uses, and it assumes that development is done inside the United States. Not exactly a fair assumption in 2008.
The important thing is that Linux distributions have a very large inherent value, and people should have some kind of sense of what it is because this value will never be reflected in the size of Linux software support sales. These numbers cooked up by Wheeler and the Linux Foundation give people not directly involved in open source at least some kind of estimate of the energy (not necessarily money, but certainly time) that is being put into open source projects like Linux.
If programmers are contributing to Linux for the good of the cause, they are paying an opportunity cost - even if we don't have to. They could spend their time coding for money instead, or not doing any coding at all. And for the thousands of programmers who contribute to the Linux kernel and distro efforts, such estimates are a simple way of saying a simple thing: Thank you. ®