Everyone cites that 'bugs are 100x more expensive to fix in production' research, but the study might not even exist

It's probably still true, though, says formal methods expert

"Software research is a train wreck," says Hillel Wayne, a Chicago-based software consultant who specialises in formal methods, citing as an example the received wisdom that bugs are far more expensive to fix once software is deployed.

Wayne did some research, noting that "if you Google 'cost of a software bug' you will get tons of articles that say 'bugs found in requirements are 100x cheaper than bugs found in implementations.' They all use this chart from the 'IBM Systems Sciences Institute'... There's one tiny problem with the IBM Systems Sciences Institute study: it doesn't exist."

Laurent Bossavit, an Agile methodology expert and technical advisor at software consultancy CodeWorks in Paris, has dedicated some time to this matter, and has a post on GitHub called "Degrees of intellectual dishonesty". Bossavit referenced a successful 1987 book by Roger S Pressman called Software Engineering: A Practitioner's Approach, which states: "To illustrate the cost impact of early error detection, we consider a series of relative costs that are based on actual cost data collected for large software projects [IBM81]."

The reference to [IBM81] notes that the information comes from "course notes" at the IBM Systems Sciences Institute. Bossavit discovered, though, that many other publications have referenced Pressman's book as the authoritative source for this research, disguising its tentative nature.

Bossavit took the time to investigate the existence of the IBM Systems Science Institute, concluding that it was "an internal training program for employees." No data was available to support the figures in the chart, which shows a neat 100x the cost of fixing a bug once software is in maintenance. "The original project data, if any exist, are not more recent than 1981, and probably older; and could be as old as 1967," said Bossavit, who also described "wanting to crawl into a hole when I encounter bullshit masquerading as empirical support for a claim, such as 'defects cost more to fix the later you fix them'."

Do software defects cost more to fix the later they are discovered? "I think the body of research so far tentatively points in that direction, depending on how you interpret 'late-stage', 'bugs', and 'more expensive'," said Wayne. "Certain bugs take more time to fix (and cause more damage) than others, and said bugs tend to be issues in the design."

Here is a 2016 paper [PDF] whose authors "examined 171 software projects conducted between 2006 and 2014," all of which used a methodology called the Team Software Process. The researchers concluded that "the times to resolve issues at different times were usually not significantly different."

Wayne is as concerned with the state of software research as with the defect question itself. He observed that papers such as the one cited above "use different definitions of defect," making it hard to draw conclusions. He said he is a proponent of Empirical Software Engineering (ESE), using evidence to learn about what works in software development, but said that "the academic incentive structures are not aligned in a way that would give industry actionable information. There's much more incentive to create new models and introduce new innovations than do the necessary 'gruntwork' that would be most useful."

He suggested focusing on what "empirical research overwhelmingly shows," which is that "code review is a good way to find software bugs and spread software knowledge. It also shows that shorter iteration cycles and feedback loops lead to higher quality software than long lead times."

The role of the "IBM Systems Sciences Institute" in cementing the authority of research that might not exist is a reminder of the importance of primary sources, which can be hard to discover in the echo chamber of the internet.

Right on cue, into our inbox popped a bit of "research" from a PR agency concerning the revenue of the top five cloud vendors, based on a "Statista survey". Statista is not primarily a research company, however. Instead it "consolidates statistical data on over 80,000 topics from more than 22,500 sources," according to its own description.

The research mentioned did not come from Statista, but from Cloud Wars. Citing Statista as the source is like attributing a statement discovered in a Google search as having the authority of Google. The risk of confusion like this is that a poor source can be promoted to a better one (and that is not intended to suggest that the Cloud Wars data is inaccurate). ®
