Wednesday, November 22, 2017

Review: Languages and Code Quality in GitHub

The study is several years old, but was recently reprinted in Communications of the ACM.  The authors mined GitHub data for active open source projects, collecting defect and development rates.  They classified the defects by type and the development languages by their features.  They found that language choice matters, but only marginally.  Some types of bug are far more common in particular languages, such as memory management bugs in C / C++.  Functional languages have the lowest defect rate; however, the analysis is based only on commit history and does not account for development time or differences between programmers.  So the takeaway is that language features do matter, but, whatever the language, programmers just write buggy code.