
Everyone cites that 'bugs are 100x more expensive to fix in production' research, but the study might not even exist

It's probably still true, though, says formal methods expert


"Software research is a train wreck," says Hillel Wayne, a Chicago-based software consultant who specialises in formal methods, instancing the received wisdom that bugs are way more expensive to fix once software is deployed.

Wayne did some research, noting that "if you Google 'cost of a software bug' you will get tons of articles that say 'bugs found in requirements are 100x cheaper than bugs found in implementations.' They all use this chart from the 'IBM Systems Sciences Institute'... There's one tiny problem with the IBM Systems Sciences Institute study: it doesn't exist."

Laurent Bossavit, an Agile methodology expert and technical advisor at software consultancy CodeWorks in Paris, has dedicated some time to this matter, and has a post on GitHub called "Degrees of intellectual dishonesty". Bossavit referenced a successful 1987 book by Roger S Pressman called Software Engineering: A Practitioner's Approach, which states: "To illustrate the cost impact of early error detection, we consider a series of relative costs that are based on actual cost data collected for large software projects [IBM81]."

The reference to [IBM81] notes that the information comes from "course notes" at the IBM Systems Sciences Institute. Bossavit discovered, though, that many other publications have referenced Pressman's book as the authoritative source for this research, disguising its tentative nature.

Bossavit took the time to investigate the existence of the IBM Systems Sciences Institute, concluding that it was "an internal training program for employees." No data was available to support the figures in the chart, which shows a neat 100x multiplier on the cost of fixing a bug once software is in maintenance. "The original project data, if any exist, are not more recent than 1981, and probably older; and could be as old as 1967," said Bossavit, who also described "wanting to crawl into a hole when I encounter bullshit masquerading as empirical support for a claim, such as 'defects cost more to fix the later you fix them'."

Do software defects cost more to fix the later they are discovered? "I think the body of research so far tentatively points in that direction, depending on how you interpret 'late-stage', 'bugs', and 'more expensive'," said Wayne. "Certain bugs take more time to fix (and cause more damage) than others, and said bugs tend to be issues in the design."

Here is a 2016 paper [PDF] whose authors "examined 171 software projects conducted between 2006 and 2014," all of which used a methodology called the Team Software Process. The researchers concluded that "the times to resolve issues at different times were usually not significantly different."

Wayne is as concerned with the state of software research as with the defect question itself. He observed that papers such as the one cited above "use different definitions of defect," making it hard to draw conclusions. He said he is a proponent of Empirical Software Engineering (ESE), using evidence to learn about what works in software development, but said that "the academic incentive structures are not aligned in a way that would give industry actionable information. There's much more incentive to create new models and introduce new innovations than do the necessary 'gruntwork' that would be most useful."

He suggested focusing on what "empirical research overwhelmingly shows," which is that "code review is a good way to find software bugs and spread software knowledge. It also shows that shorter iteration cycles and feedback loops lead to higher quality software than long lead times."

The role of the "IBM Systems Sciences Institute" in cementing the authority of research that might not exist is a reminder of the importance of primary sources, which can be hard to discover in the echo chamber of the internet.

Right on cue, into our inbox popped a bit of "research" from a PR agency concerning the revenue of the top five cloud vendors, based on a "Statista survey". Statista is not primarily a research company, however. Instead it "consolidates statistical data on over 80,000 topics from more than 22,500 sources," according to its own description.

The research mentioned did not come from Statista, but from Cloud Wars. Citing Statista as the source is like crediting a statement found via a Google search to Google itself. The risk of confusion like this is that a poor source can be promoted to a more authoritative one (and that is not intended to suggest that the Cloud Wars data is inaccurate). ®
