Coding unit tests is boring. Wouldn't it be cool if an AI could do it for you? That's where Diffblue comes in

A big time saver – but 'we can't tell if the current logic that you have in the code is correct or not.' Oh


Oxford-based Diffblue has claimed its AI will automate one of the most important but tedious tasks in software development: writing unit tests.

Test-driven development (TDD) is a methodology invented – or, as he has said, rediscovered – by Kent Beck, who wrote a unit test framework for Smalltalk in the late '80s. The idea of exercising code with unit tests, which run the code and check that the output is as expected, is now widely accepted as best practice.

Unit tests help to avoid regressions – bugs introduced into code that previously worked correctly – and are a critical part of CI/CD (Continuous Integration and Continuous Delivery) since they give the developer confidence that an application or service still works after they add or modify the code. It is therefore hard to maintain rapid velocity – frequent releases – without rigorous unit testing. The popular SQLite database engine has 640 times as much testing code as code in the engine itself.

Writing unit tests may be important, but it is less interesting than adding features. "It is tedious grunt work; it's very important, but it is the first thing to go when the team is under time pressure," Mathew Lodge, CEO of Diffblue, told The Register. "It's something that humans are not very good at, and they make lots of mistakes as well because it's boring."

Take Cover...

Diffblue was spun out of the University of Oxford following research into how to use AI to write tests automatically. There are already plenty of tools that generate unit tests, but in general they are template-based and rely on developers to add the logic. Diffblue's Cover, on the other hand, writes everything. "We write a full set of unit tests that compile and pass. It's a full unit test suite that reflects the current behaviour of the program so that when you make a change, you can find out from the test behaviour what you have changed and so you catch regressions," said Lodge.

Diffblue Cover running AI-generated tests on the sample Spring Boot application Petclinic

Cover has now been released as a free Community Edition. It only works with Java, and the only IDE integration is with IntelliJ IDEA, though the paid-for version also has a command-line option.

"As a small company we want to do one thing really well first," said Lodge. "The core technology is language independent so when we analyse the program we build a model of the program that we can reason about, then we are running tests, we again use a generic representation of the test which we then translate into Java."

Lodge said that JavaScript and Python are common requests, as is support for Visual Studio Code, for which there is already an early alpha version.

Let's have a play then

We wrote a new method for the Spring Boot Petclinic sample, which includes a database of pets and their owners. Our method is HasPet(), which determines whether an owner actually has a pet. Right-click the method, select Write Test, and Cover generates two test methods. The first creates a new owner but no pet, calls the method and asserts that it is false. The second test creates a new owner and a pet, assigns the pet to the owner, calls the method and asserts it to be true. Impressive.
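For readers who want a feel for it, here is a minimal sketch of the method and the two tests described above. It is not Cover's actual output: the Owner and Pet classes are hypothetical, stripped-down stand-ins for the Petclinic model, the pet name is made up, and plain Java checks are used in place of a test framework so that the example runs on its own.

```java
import java.util.ArrayList;
import java.util.List;

public class HasPetSketch {
    // Hypothetical stand-in for the Petclinic Pet class.
    static class Pet {
        final String name;

        Pet(String name) {
            this.name = name;
        }
    }

    // Hypothetical stand-in for the Petclinic Owner class.
    static class Owner {
        private final List<Pet> pets = new ArrayList<>();

        void addPet(Pet pet) {
            pets.add(pet);
        }

        // The method under test, named as in the article.
        boolean HasPet() {
            return !pets.isEmpty();
        }
    }

    // First test: a new owner with no pet, so HasPet() should be false.
    static boolean ownerWithoutPetReportsFalse() {
        Owner owner = new Owner();
        return !owner.HasPet();
    }

    // Second test: a new owner with a pet assigned, so HasPet() should be true.
    static boolean ownerWithPetReportsTrue() {
        Owner owner = new Owner();
        owner.addPet(new Pet("Rex"));
        return owner.HasPet();
    }

    public static void main(String[] args) {
        System.out.println("owner without pet test passed: " + ownerWithoutPetReportsFalse());
        System.out.println("owner with pet test passed: " + ownerWithPetReportsTrue());
    }
}
```

In a real project these would more likely be test-framework methods with assertions rather than boolean helpers, but the shape is the same: build the objects, call the method, check the result.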

There is a snag, though. We modified HasPet() to introduce a bug: it now returned true when it should have returned false, and vice versa. We asked Cover to generate new tests. The new tests passed, since Cover did not know the intent of the code, only what it actually did. That said, Cover left the old tests in place, and they duly failed, so we did have some clue that there was a problem. Had we written the bug in the original code, though, the Cover test would have been useless – unless, perhaps, the developer inspected the test code and questioned its assertions.
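The snag can be reproduced in a few lines. The sketch below is our own illustration, not Diffblue's code: the HasPetImpl and RecordedTest interfaces and the record() helper are hypothetical, and an owner is reduced to a pet count. A "generated" test here simply records what the code returns today and later checks that it still returns the same thing.

```java
public class CharacterizationSketch {
    // Hypothetical abstraction: an implementation of HasPet() given a pet count.
    interface HasPetImpl {
        boolean hasPet(int petCount);
    }

    // Hypothetical abstraction: a test that passes or fails against an implementation.
    interface RecordedTest {
        boolean passes(HasPetImpl impl);
    }

    static final HasPetImpl CORRECT = petCount -> petCount > 0;
    static final HasPetImpl BUGGY = petCount -> petCount <= 0; // logic inverted

    // Records the current behaviour of impl for one input and turns it into a test.
    // The test knows what the code did, not what it was meant to do.
    static RecordedTest record(HasPetImpl impl, int petCount) {
        boolean observed = impl.hasPet(petCount);
        return candidate -> candidate.hasPet(petCount) == observed;
    }

    public static void main(String[] args) {
        RecordedTest oldTest = record(CORRECT, 0); // written before the bug
        RecordedTest newTest = record(BUGGY, 0);   // written after the bug

        // The old test catches the regression: prints false.
        System.out.println("old test passes on buggy code: " + oldTest.passes(BUGGY));
        // The new test enshrines the bug as expected behaviour: prints true.
        System.out.println("new test passes on buggy code: " + newTest.passes(BUGGY));
    }
}
```

Tests recorded this way are only as trustworthy as the code they were recorded from, which is why the tests generated before the change are the ones that matter.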

Lodge acknowledged the problem, telling us: "The code might have bugs in it to begin with, and we can't tell if the current logic that you have in the code is correct or not, because we don't know what the intent is of the programmer, and there's no good way today of being able to express intent in a way that a machine could understand.

"That is generally not the problem that most of our customers have. Most of our customers have very few unit tests, and what they typically do is have a set of tests that run functional end-to-end tests that run at the end of the process."

Lodge's argument is that if you start with a working application, then let Cover write tests, you have a code base that becomes amenable to high velocity delivery. "Our customers don't have any unit tests at all, or they have maybe 5 to 10 per cent coverage. Their issue is not that they can't test their software: they can. They can run end-to-end tests that run right before they cut a release. What they don't have are unit tests that enable them to run a CI/CD pipeline and be able to ship software every day, so typically our customers are people who can ship software twice a year."

Diffblue Cover creating tests for the Petclinic application

The reason for the lack of unit tests may be time pressure or may be historical. "Most organisations build on existing applications, and that is the biggest challenge for folks like banks. You have all of this Java code that basically runs the bank, you have a way to ship it, because you have tests that you can run at the end of the process, but what you don't have are tests that you can run after every single commit."

How does Diffblue Cover work? "It's a combination of static and dynamic analysis," said Lodge. "We write what we think is a good test to get a starter. Then we run it against the code and we observe the behaviour of the method. From running it we can see what the method does, with side effects as well as the return value, and then we go looking for a better test than the one that we generated. Then it's a probabilistic search of the space of possible test cases."
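Diffblue has not published its algorithm in this form, so what follows is only a toy illustration of the general idea Lodge describes: run candidate inputs against the code, observe what comes back, and keep the candidates that reveal behaviour not seen before. The classify() method, the input range, and the plain random sampling are all our assumptions and stand in for whatever Cover really does.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

public class TestSearchSketch {
    // Hypothetical method under test: each return value marks a different branch.
    static String classify(int petCount) {
        if (petCount < 0) {
            return "invalid";
        }
        if (petCount == 0) {
            return "no pets";
        }
        if (petCount > 3) {
            return "many pets";
        }
        return "some pets";
    }

    // Samples random inputs, runs the method on each, and keeps the first input
    // that produces each distinct observed behaviour.
    static Map<String, Integer> search(long seed, int attempts) {
        Random random = new Random(seed);
        Map<String, Integer> keptInputs = new LinkedHashMap<>();
        for (int i = 0; i < attempts; i++) {
            int candidate = random.nextInt(21) - 10; // inputs in [-10, 10]
            String observed = classify(candidate);
            keptInputs.putIfAbsent(observed, candidate);
        }
        return keptInputs;
    }

    public static void main(String[] args) {
        // Each kept input becomes one test case asserting the observed result.
        for (Map.Entry<String, Integer> entry : search(42L, 1000).entrySet()) {
            System.out.println("expect classify(" + entry.getValue()
                    + ") to return \"" + entry.getKey() + "\"");
        }
    }
}
```

A real tool would presumably steer the search with what it learns from static analysis rather than sampling blindly, and would also have to account for side effects, which this sketch ignores.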

Interested parties can review some of the research behind this process on the Diffblue site.

Diffblue emerged out of a partnership with Goldman Sachs, hence its skew towards the banking sector. "Goldman Sachs followed the company because they were very interested in the technology, Goldmans helped us build the product and essentially we built the first version with Goldman's help," said Lodge. "What you see today in the community edition is version 2 of the product, with everything we learned from that first experience. There hasn't been a tool like this before. The purpose of the Community Edition is to have a free way for people to see what the tool can do.

"We can write a test with full mocking in about 600 milliseconds. So we are 10 to 100 times faster than humans at writing these tests."

Cover does a great job of exercising the developer's code, but unfortunately only a human will know if it is working as intended. ®
