Picture the scene: you're a developer looking at someone else's code for the first time, and you can see that a lot needs changing.
Performance bugs mean it won't scale for much longer. The code design makes it difficult to localise for another language, which is going to cripple that Latin American business expansion. Its architecture means that if you pull on a piece of code over here, something breaks over there. It needs fixing – badly.
The trouble is the project sponsor wants a new feature implemented this quarter, and there are only so many developer hours. "How bad are the problems? Quantify them or forget it," she says. Where on earth do you start?
What we're talking about here is technical debt. The decisions that developers make early on to get the code out the door can come back and bite them – or their successors – later. Outside the rarefied world of agile computing and academia, few people look at this at all, let alone try to measure it.
Perhaps we can get an answer from the person who applied the debt metaphor to code in the first place: agile software developer Ward Cunningham. Technical debt is not about poorly written software, he says. He doesn't condone sloppy code. Instead, it's a strategy to trade an imperfect understanding of the problem domain for a quick code release.
Sometimes you don't know everything you need to know about what you're coding, but you have deadlines to hit. You ship the best code you can in line with what you know at the time. That creates technical debt. When you adjust the software later to better reflect reality and keep it relevant, that's paying down your debt.
The trick comes in working out how much debt to pay down by fixing software behind the scenes, and how much to let ride because you're busy building new features into the system to please the business. Get it wrong, and you can watch your software atrophy and die.
"I have become convinced that those helpless in the face of 'debt' are simply bad managers that manufacture debt due [to] decision making based on fantasy," Cunningham told The Register.
You can better manage it by measuring it, but what yardstick do you use? Cunningham asks what percentage of a team's time is spent doing work that can't be explained in a way that the customer is happy to pay for.
That's a high-level definition, though, and he points out that a developer's familiarity with the code, rather than the code itself, would affect the level of technical debt based on that measurement.
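As a rough illustration of Cunningham's yardstick, the sketch below works out that percentage from a team's time log. The WorkItem structure, the customer_would_pay flag and the sample entries are all hypothetical, invented here for illustration; how a team decides what the customer would happily pay for is exactly the judgement call he describes.

```python
from dataclasses import dataclass


@dataclass
class WorkItem:
    description: str
    hours: float
    # Hypothetical flag: could the team explain this work in a way
    # the customer is happy to pay for?
    customer_would_pay: bool


def unexplainable_share(items):
    """Return the percentage of logged hours the customer would not happily pay for."""
    total = sum(item.hours for item in items)
    if total == 0:
        return 0.0
    hidden = sum(item.hours for item in items if not item.customer_would_pay)
    return 100.0 * hidden / total


# Invented sample data for one sprint.
log = [
    WorkItem("New export feature", 30, True),
    WorkItem("Untangling the billing module before changing it", 15, False),
    WorkItem("Chasing a flaky deployment script", 5, False),
]

share = unexplainable_share(log)
print(f"{share:.0f}% of the team's time went on work the customer can't see")
```

The same log would score differently for a team that knows the code well, which is Cunningham's caveat: the number reflects familiarity as much as the software.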
No metric to rule them all
There are different measurement mechanisms for technical debt, says Ipek Ozkaya, principal member of the technical staff at Carnegie Mellon University's Software Engineering Institute.
Ozkaya thinks of technical debt across three broad categories. Alongside code, where you're concerned with issues like structure and quality, she considers architecture, where you're focusing on which decisions led to short-term wins at the expense of more efficient, flexible ways to put the software together.
Finally, she highlights the production environment. DevOps' underlying automation tools are all code, she points out. Identifying and quantifying chokepoints there that may slow down rollouts can benefit the entire software lifecycle.
Considering these three areas highlights the difficulty of finding a single yardstick for technical debt. "There's not one magic metric out there, but that doesn't mean that there's no hope," she says. "It comes down to where your technical debt is and what you're trying to manage."
Some organisations might focus on the ability to add new functionality. Others might be dealing with unintentional technical debt like accrued quality issues in the software. They may need to know which ones to fix first based on how much they hinder the business.
We need categories for these metrics. Gartner derives them from the ISO 25010 standard on software quality, which it uses as a framework. It identifies seven types of nonfunctional requirement: reliability, security, usability, maintainability, portability, compatibility, and performance efficiency.
The costs associated with each of these will be different. For example, performance efficiency debt will incur higher operational costs. Security debt will bump up business risk costs. Usability debt (crummy interfaces) will increase business costs due to error and impeded user efficiency.
Gartner's research note on technical debt still doesn't give us a way to come up with actual numbers. That's where Dr Bill Curtis hopes that the new Automated Technical Debt Measure standard will provide some clarity.
Curtis is the executive director of the Consortium for IT Software Quality (CISQ), founded by CMU's SEI along with the Object Management Group (OMG). He gives props to Cunningham's definition of technical debt, but adds that CISQ's differs slightly. "We take a more industrial approach," he says.
Formed in 2010, this special interest group focuses on the cost of ownership around technical debt: the ability to translate software debt into costs. Industry users want to predict corrective maintenance costs, he says. They want to identify troubled applications, to guide repair or replace decisions, and to evaluate the capabilities of their team. That takes hard numbers.
It also takes a focus on clear and present dangers, rather than code issues that you can live with for a long time. The focus here is on severe flaws. "Some problems you can defer forever because they're not critical and you don't have to fix them right away," Curtis says. "We don't count those as technical debt."
Any debt has two parts: principal and interest. In CISQ's view, the principal is the cost of fixing severe architectural and coding flaws in already-released software. The interest is the ongoing IT cost that those flaws are generating. That might translate into more developer hours, increased usage of IT resources, or downtime because of software failure. In this world view, the interest on a flaw is a function of that flaw's severity.
Those metrics translate directly into two kinds of business risk. The first is opportunity cost, says Curtis: the benefits you could have achieved by using resources to develop new features rather than fix the existing software.
The second risk relates to liability. "As we digitise the entire corporation, the risk to the business is greater and greater. CEOs are fed up with their jobs being at risk because some programmer screwed something up, took the system down and cost them a hundred million bucks," he says.
The liability issue makes technical debt a governance issue. "CEO jobs are on the line for something that they don't even begin to understand," Curtis points out.
For its technical measurement standard, CISQ chose four quality characteristics from ISO 25010: reliability, performance efficiency, security, and maintainability. It identified 86 severe violations across these areas that static analysis tools could find.
It built the technical debt measurements on top of those weaknesses, using the violations as a basis to calculate the cost of quality. It surveyed developers (primarily working in Java and .NET) to find out how long it would take to fix each of the defects in the simplest case, and used that as the default score.
The CISQ team also uses several adjustment factors to move the scale. These can include the structural complexity and the diversity of language throughout the system, along with the concentration of weaknesses (which can help make fixes simpler).
Development teams can sum the adjusted values for all instances of a weakness in the system to provide a specific technical debt measure for that weakness. They can then go further and add the technical debt scores for all violations in a software quality area such as security. Summing each of the four software quality areas will produce an overall technical debt score.
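The roll-up described above is simple arithmetic, and a minimal sketch may make it concrete. Everything specific here is an assumption for illustration: the weakness names, the default effort hours and the adjustment factors are invented, not taken from the CISQ standard, and the adjustment is modelled as a single multiplier per occurrence.

```python
from collections import defaultdict

# Hypothetical default remediation effort in hours for the simplest case.
# These are NOT figures from the CISQ standard.
DEFAULT_EFFORT_HOURS = {
    "sql_injection": 2.0,
    "unreleased_resource": 1.0,
    "expensive_loop_operation": 1.5,
    "excessive_coupling": 4.0,
}

# Hypothetical mapping of each weakness to one of the four quality areas.
WEAKNESS_AREA = {
    "sql_injection": "security",
    "unreleased_resource": "reliability",
    "expensive_loop_operation": "performance efficiency",
    "excessive_coupling": "maintainability",
}


def roll_up(occurrences):
    """Sum adjusted effort per weakness, then per quality area, then overall.

    occurrences: iterable of (weakness_name, adjustment_factor) pairs,
    one pair per instance found by a static analysis tool.
    """
    per_weakness = defaultdict(float)
    for weakness, factor in occurrences:
        per_weakness[weakness] += DEFAULT_EFFORT_HOURS[weakness] * factor

    per_area = defaultdict(float)
    for weakness, hours in per_weakness.items():
        per_area[WEAKNESS_AREA[weakness]] += hours

    total = sum(per_area.values())
    return dict(per_weakness), dict(per_area), total


# Invented findings: the second SQL injection sits in more complex code,
# so it carries a higher adjustment factor.
findings = [
    ("sql_injection", 1.0),
    ("sql_injection", 1.5),
    ("unreleased_resource", 1.0),
    ("expensive_loop_operation", 1.0),
    ("excessive_coupling", 2.0),
]

per_weakness, per_area, total = roll_up(findings)
print(per_area)
print(f"Overall technical debt: {total} hours")
```

Multiplying the hours by a loaded developer rate would turn the total into the kind of cost figure a project sponsor asks for.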
Based on these scores, teams can start making decisions based on quantifiable data. "You can start saying 'for this application, which are the ones that we most need to fix'. And then you can build remediation plans," Curtis says.
The future of technical debt
Applying this level of analysis to technical debt will be a big undertaking for many companies. Those without the resources or know-how to crunch the numbers and convert them into business decisions may find this an entirely academic exercise. How can they bridge the gap between what's ideal and what's possible?
Development teams might make it easier to measure technical debt by documenting it as they go, argues SEI's Ozkaya. By tagging different kinds of technical debt in their version control system, they could quantify it from the ground up and maintain a running tally of problems.
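One way such a running tally might work is sketched below. The [debt:...] tag convention is hypothetical, made up for this example rather than proposed by Ozkaya, and the three tag values simply borrow her code, architecture and production categories. The commit subjects are invented; in practice they could come from the version control log.

```python
import re
from collections import Counter

# Hypothetical tagging convention: a commit subject carries a marker such as
# "[debt:code]", "[debt:architecture]" or "[debt:production]".
DEBT_TAG = re.compile(r"\[debt:(code|architecture|production)\]", re.IGNORECASE)


def tally_debt(commit_subjects):
    """Count debt tags by category across a list of commit subject lines."""
    counts = Counter()
    for subject in commit_subjects:
        for kind in DEBT_TAG.findall(subject):
            counts[kind.lower()] += 1
    return counts


# Invented commit subjects for illustration.
subjects = [
    "Add CSV export [debt:code] skipped input validation",
    "Hard-code region lookup [debt:architecture]",
    "Fix typo in README",
    "Work around slow deploy script [debt:production]",
    "Inline currency conversion [DEBT:CODE]",
]

tally = tally_debt(subjects)
for kind, count in sorted(tally.items()):
    print(f"{kind}: {count}")
```

A count is only a starting point. It says how often debt was knowingly taken on, not how much each item costs to repay.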
While developers grapple with the problem, experts are already eyeing other emerging issues. "More relevant to current systems is the degree that their behaviour cannot be predicted due to their internal structure," argues Cunningham. He's talking about opaque algorithms like machine learning here. It is harder to measure technical debt in these systems because they make decisions that are not explicitly programmed.
Curtis is already eyeing a related area of development: IoT. CISQ will extend its debt measuring concept into embedded real-time systems, which will enable people to produce technical debt measurements for these massive collections of edge-based sensors.
Now, there's a challenge. These IoT ecosystems are a magnet for technical debt. A decade or two hence, most of the IoT kit we're installing now might still be working. The cost of updating that to meet tomorrow's requirements might be too scary to quantify. ®