Coding unit tests is boring. Wouldn't it be cool if an AI could do it for you? That's where Diffblue comes in

A big time saver – but 'we can't tell if the current logic that you have in the code is correct or not.' Oh


Oxford-based Diffblue has claimed its AI will automate one of the most important but tedious tasks in software development: writing unit tests.

Test-driven development (TDD) is a methodology invented – or, as he has said, rediscovered – by Kent Beck, who wrote a unit test framework for Smalltalk in the late '80s. The idea of exercising code with unit tests, which run the code and check that the output is as expected, is now widely accepted as best practice.

Unit tests help to avoid regressions – bugs introduced into code that previously worked correctly – and are a critical part of CI/CD (Continuous Integration and Continuous Delivery) since they give the developer confidence that an application or service still works after they add or modify the code. It is therefore hard to maintain rapid velocity – frequent releases – without rigorous unit testing. The popular SQLite database engine has 640 times as much testing code as code in the engine itself.

Writing unit tests may be important, but it is less interesting than adding features. "It is tedious grunt work; it's very important, but it is the first thing to go when the team is under time pressure," Mathew Lodge, CEO of Diffblue, told The Register. "It's something that humans are not very good at, and they make lots of mistakes as well because it's boring."

Take Cover...

Diffblue was spun out of the University of Oxford following research into how to use AI to write tests automatically. There are already plenty of tools that generate unit tests, but in general they are template-based and rely on developers to add the logic. Diffblue's Cover, on the other hand, writes everything. "We write a full set of unit tests that compile and pass. It's a full unit test suite that reflects the current behaviour of the program so that when you make a change, you can find out from the test behaviour what you have changed and so you catch regressions," said Lodge.

Diffblue Cover running AI-generated tests on the sample Spring Boot application Petclinic

Cover has now been released as a free Community Edition. It only works with Java, and the only IDE integration is with IntelliJ IDEA, though the paid-for version also has a command-line option.

"As a small company we want to do one thing really well first," said Lodge. "The core technology is language independent so when we analyse the program we build a model of the program that we can reason about, then we are running tests, we again use a generic representation of the test which we then translate into Java."

Lodge said that JavaScript and Python are common requests, as is support for Visual Studio Code, for which there is already an early alpha version.

Let's have a play then

We wrote a new method for the Spring Boot Petclinic sample, which includes a database of pets and their owners. Our method is HasPet(), which determines whether an owner actually has a pet. Right-click the method, select Write Test, and Cover generates two test methods. The first creates a new owner but no pet, calls the method and asserts that the result is false. The second test creates a new owner and a pet, assigns the pet to the owner, calls the method and asserts that the result is true. Impressive.
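For readers who want to picture the two tests described above, here is a minimal sketch in plain Java. It is not Cover's actual output, and the Owner and Pet classes are hypothetical, stripped-down stand-ins for the Petclinic model. A hand-rolled check() helper replaces JUnit assertions so the example runs without any dependencies.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the Petclinic Pet class (assumption, not the real model).
class Pet {
    private final String name;

    Pet(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }
}

// Hypothetical stand-in for the Petclinic Owner class (assumption, not the real model).
class Owner {
    private final List<Pet> pets = new ArrayList<>();

    void addPet(Pet pet) {
        pets.add(pet);
    }

    // The method under test, named as in the article.
    boolean HasPet() {
        return !pets.isEmpty();
    }
}

class HasPetTests {
    // First test: a new owner with no pet, so HasPet() should be false.
    static void testHasPetWithoutPet() {
        Owner owner = new Owner();
        check(!owner.HasPet(), "an owner with no pets should not have a pet");
    }

    // Second test: a new owner with one pet assigned, so HasPet() should be true.
    static void testHasPetWithPet() {
        Owner owner = new Owner();
        owner.addPet(new Pet("Leo"));
        check(owner.HasPet(), "an owner with one pet should have a pet");
    }

    // Tiny assertion helper used instead of a test framework.
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    public static void main(String[] args) {
        testHasPetWithoutPet();
        testHasPetWithPet();
        System.out.println("2 tests passed"); // prints "2 tests passed"
    }
}
```

In a real project these would normally be JUnit test methods; the structure (arrange an owner, call the method, assert on the result) is the same.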

There is a snag, though. We modified HasPet() to introduce a bug: it now returned true when it should have returned false, and vice versa. We asked Cover to generate new tests. The new tests passed, since Cover did not know the intent of the code, only what it actually did. That said, Cover left the old tests in place, and they duly failed, so we did have some clue that there was a problem. Had we written the bug in the original code, though, the Cover test would have been useless – unless, perhaps, the developer inspected the test code and questioned its assertions.
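The snag can be reproduced in miniature. The sketch below assumes that "generating a test" simply means running the code once and recording what it returned, which is a deliberate oversimplification and not a description of how Cover works. The names hasPetOriginal, hasPetBuggy and RecordedTest are all hypothetical.

```java
import java.util.function.IntPredicate;

class CharacterizationDemo {
    // Original logic: an owner has a pet when the pet count is above zero.
    static boolean hasPetOriginal(int petCount) {
        return petCount > 0;
    }

    // Buggy logic: the result is inverted.
    static boolean hasPetBuggy(int petCount) {
        return !(petCount > 0);
    }

    // A test that remembers an input and the value the code returned for it.
    static final class RecordedTest {
        final int input;
        final boolean expected;

        RecordedTest(int input, boolean expected) {
            this.input = input;
            this.expected = expected;
        }

        // The test passes when the code still returns the recorded value.
        boolean passes(IntPredicate codeUnderTest) {
            return codeUnderTest.test(input) == expected;
        }
    }

    // "Generates" a test by observing the current behaviour, right or wrong.
    static RecordedTest record(IntPredicate codeUnderTest, int input) {
        return new RecordedTest(input, codeUnderTest.test(input));
    }

    public static void main(String[] args) {
        // Test recorded before the bug was introduced (owner with zero pets).
        RecordedTest oldTest = record(CharacterizationDemo::hasPetOriginal, 0);
        // Test recorded after the bug was introduced.
        RecordedTest newTest = record(CharacterizationDemo::hasPetBuggy, 0);

        // The old test catches the regression.
        System.out.println("old test against buggy code passes: "
                + oldTest.passes(CharacterizationDemo::hasPetBuggy)); // false
        // The new test encodes the bug as expected behaviour, so it passes.
        System.out.println("new test against buggy code passes: "
                + newTest.passes(CharacterizationDemo::hasPetBuggy)); // true
    }
}
```

A test recorded from observed behaviour is only as correct as the code it was recorded from, which is why the tests that predate the bug are the ones that fail.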

Lodge acknowledged the problem, telling us: "The code might have bugs in it to begin with, and we can't tell if the current logic that you have in the code is correct or not, because we don't know what the intent is of the programmer, and there's no good way today of being able to express intent in a way that a machine could understand.

"That is generally not the problem that most of our customers have. Most of our customers have very few unit tests, and what they typically do is have a set of tests that run functional end-to-end tests that run at the end of the process."

Lodge's argument is that if you start with a working application, then let Cover write tests, you have a code base that becomes amenable to high velocity delivery. "Our customers don't have any unit tests at all, or they have maybe 5 to 10 per cent coverage. Their issue is not that they can't test their software: they can. They can run end-to-end tests that run right before they cut a release. What they don't have are unit tests that enable them to run a CI/CD pipeline and be able to ship software every day, so typically our customers are people who can ship software twice a year."

Diffblue Cover creating tests for the Petclinic application

The reason for the lack of unit tests may be time pressure or may be historical. "Most organisations build on existing applications, and that is the biggest challenge for folks like banks. You have all of this Java code that basically runs the bank, you have a way to ship it, because you have tests that you can run at the end of the process, but what you don't have are tests that you can run after every single commit."

How does Diffblue Cover work? "It's a combination of static and dynamic analysis," said Lodge. "We write what we think is a good test to get a starter. Then we run it against the code and we observe the behaviour of the method. From running it we can see what the method does, with side effects as well as the return value, and then we go looking for a better test than the one that we generated. Then it's a probabilistic search of the space of possible test cases."
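To give a feel for the "run it, observe it, look for a better test" loop Lodge describes, here is a toy sketch. It is emphatically not Diffblue's algorithm: it assumes a plain random search over integer inputs and treats each distinct return value as a behaviour worth keeping a test for. The method sizeLabel and everything else in the block are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

class TestSearchSketch {
    // Hypothetical method under test with three distinct behaviours.
    static String sizeLabel(int petCount) {
        if (petCount < 0) {
            return "invalid";
        }
        if (petCount < 10) {
            return "small";
        }
        return "large";
    }

    // Samples random inputs, runs the method, and keeps the first input
    // that produced each distinct result. Returns input -> observed result.
    static Map<Integer, String> search(long seed, int attempts) {
        Random random = new Random(seed);
        Map<String, Integer> firstInputForResult = new LinkedHashMap<>();
        for (int i = 0; i < attempts; i++) {
            int candidate = random.nextInt(101) - 50; // values from -50 to 50
            String observed = sizeLabel(candidate);   // run and observe
            firstInputForResult.putIfAbsent(observed, candidate);
        }
        // Turn the kept candidates into (input, expected result) test cases.
        Map<Integer, String> tests = new LinkedHashMap<>();
        firstInputForResult.forEach((result, input) -> tests.put(input, result));
        return tests;
    }

    public static void main(String[] args) {
        // Print each kept candidate in the form of an assertion.
        search(42L, 500).forEach((input, expected) ->
                System.out.println("assert sizeLabel(" + input + ").equals(\"" + expected + "\")"));
    }
}
```

A real tool would also have to handle object construction, mocking and side effects, and would score candidate tests on more than novelty of the return value; the sketch only shows the shape of a search driven by observed behaviour.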

Interested parties can review some of the research behind this process on the Diffblue site.

Diffblue emerged out of a partnership with Goldman Sachs, hence its skew towards the banking sector. "Goldman Sachs followed the company because they were very interested in the technology, Goldmans helped us build the product and essentially we built the first version with Goldman's help," said Lodge. "What you see today in the community edition is version 2 of the product, with everything we learned from that first experience. There hasn't been a tool like this before. The purpose of the Community Edition is to have a free way for people to see what the tool can do.

"We can write a test with full mocking in about 600 milliseconds. So we are 10 to 100 times faster than humans at writing these tests."

Cover does a great job of exercising the developer's code, but unfortunately only a human will know if it is working as intended. ®
