A Tale of Four Kernels

The FreeBSD, GNU/Linux, Solaris, and Windows operating systems have kernels that provide comparable facilities. Interestingly, their code bases share almost no common parts, while their development processes vary dramatically. We analyze the source code of the four systems by collecting metrics in the areas of file organization, code structure, code style, the use of the C preprocessor, and data organization. The aggregate results indicate that, across various areas and many different metrics, the four systems score comparably despite having been developed using wildly different processes. This allows us to posit that the structure and internal quality attributes of a working, non-trivial software artifact will reflect first and foremost the engineering requirements of its construction, with the influence of process being marginal, if it exists at all.

Arguments regarding the efficacy of open source products and development processes often employ external quality attributes [21], anecdotal evidence [17], or even plain hand-waving [13]. Although considerable research has been performed on open source artifacts and processes [10,36,7,9,41,3,32], the direct comparison of open source products with corresponding proprietary systems has remained an elusive goal. The recent open-sourcing of the Solaris kernel and the distribution of large parts of the Windows kernel source code to research institutions have provided us with a window of opportunity to perform a comparative evaluation of the code of open source and proprietary systems.

Here we report on code quality metrics we collected from four large industrial-scale operating systems: FreeBSD, Linux, OpenSolaris, and the Windows Research Kernel (WRK). The main contribution of this research is the finding that there are no significant across-the-board code quality differences among four large working systems that have been developed using various open-source and proprietary processes. An additional contribution is the proposal of numerous code quality metrics for objectively evaluating software written in C. Although these metrics have not been empirically validated, they are based on generally accepted coding guidelines, and therefore represent the rough consensus of developers concerning desirable code attributes.
