In the red corner: Malware-breeding AI. And in the blue corner: The AI trying to stop it

Behind the scenes of infosec's cat-and-mouse game


Feature The magic AI wand has been waved over language translation, and voice and image recognition, and now: computer security.

Antivirus makers want you to believe they are adding artificial intelligence to their products: software that has learned how to catch malware on a device. There are two potential problems with that. Either it's marketing hype and not really AI – or it's true, in which case don't forget that such systems can still be hoodwinked.

It's relatively easy to trick machine-learning models – especially in image recognition. Change a few pixels here and there, and an image of a bus can be warped so that the machine thinks it’s an ostrich. Now take that thought and extend it to so-called next-gen antivirus.
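
How easy is that tampering? One well-known recipe is the fast gradient sign method: use the classifier's own gradients to work out which way to nudge each pixel. The sketch below is a minimal illustration, assuming a PyTorch image classifier and pixel values scaled to the range zero to one; it is not tied to any particular vision or antivirus product.

```python
# Minimal sketch of the fast gradient sign method (FGSM) for fooling an
# image classifier. Assumes `model` is a PyTorch network returning logits
# and `image` is a batched tensor with pixel values in [0, 1].
import torch
import torch.nn.functional as F

def fgsm_perturb(model, image, label, epsilon=0.01):
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Nudge every pixel a small step in the direction that increases the loss,
    # so the picture looks the same to a human but not to the model.
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0.0, 1.0).detach()
```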

Enter Endgame, a cyber-security biz based in Virginia, USA, which you may recall popped up at DEF CON this year. It has effectively pitted two machine-learning systems against each other: one trained to detect malware in downloaded files, the other trained to customize malware so it slips past that detector. The aim is to craft software that can manipulate malware into potentially undetectable samples, and then use those variants to improve machine-learning-based scanners, creating a constantly improving antivirus system.
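
Boiled down, the idea looks something like the loop below. This is only a structural sketch: the helper functions and method names (train_detector, train_evader, mutate, flags) are hypothetical placeholders for whatever models get plugged in, not Endgame's actual code.

```python
# Structural sketch of the cat-and-mouse loop: an evasion agent mutates
# malware, and whatever slips past the scanner is folded back into the
# scanner's training data. All helpers here are hypothetical placeholders.
def cat_and_mouse(seed_malware, benign_files, train_detector, train_evader, rounds=3):
    detector = train_detector(seed_malware, benign_files)        # initial ML scanner
    for _ in range(rounds):
        evader = train_evader(detector, seed_malware)             # agent learns to dodge it
        variants = [evader.mutate(sample) for sample in seed_malware]
        evasive = [v for v in variants if not detector.flags(v)]  # got past the scanner
        detector = train_detector(seed_malware + evasive, benign_files)  # harden it
    return detector
```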

The key thing is recognizing that software classifiers – from image recognition to antivirus – can suck, and that you have to do something about it.

“Machine learning is not a one-stop shop solution for security,” said Hyrum Anderson, principal data scientist and researcher at Endgame. He and his colleagues teamed up with researchers from the University of Virginia to create the aforementioned cat-and-mouse game, which breeds better and better malware and learns from it.

“When I tell people what I’m trying to do, it raises eyebrows,” Anderson told The Register. “People ask me, ‘You’re trying to do what now?’ But let me explain.”

Generating adversarial examples

A lot of data is required to train machine learning models. It took ImageNet – which contains more than 14 million pictures sorted into thousands of categories – to boost image recognition models to the performance possible today.

The goal of the antivirus game is to generate adversarial samples to harden future machine learning models against increasingly stealthy malware.

To understand how this works, imagine a software agent learning to play the game Breakout, Anderson says. The classic arcade game is simple. An agent controls a paddle, moving it left or right to hit a ball bouncing back and forth off a brick wall. Every time the ball strikes a brick, the brick disappears and the agent scores a point. To win, the agent has to clear the entire wall while continuously batting the ball back and preventing it from falling to the bottom of the screen.

Endgame’s malware game is somewhat similar, but instead of a ball the bot is dealing with malicious Windows executables. The aim of the game is to fudge the file, changing bytes here and there, in a way so that it hoodwinks an antivirus engine into thinking the harmful file is safe. The poisonous file slips through – like the ball carving a path through the brick wall in Breakout – and the bot gets a point.
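
Framed as a reinforcement-learning problem, the game might look roughly like the environment below, written in the style of an OpenAI Gym loop. The class and method names here (MalwareEvasionEnv, scanner.flags, and so on) are illustrative stand-ins, not the names used in Endgame's code.

```python
# Hedged sketch of the evasion game as a reinforcement-learning environment.
# `scanner` is any malware classifier with a hypothetical flags() method;
# `mutate` is one functionality-preserving tweak chosen by the agent per turn.
class MalwareEvasionEnv:
    def __init__(self, scanner, sample_bytes, max_turns=10):
        self.scanner = scanner
        self.original = bytes(sample_bytes)
        self.max_turns = max_turns

    def reset(self):
        self.sample = self.original
        self.turn = 0
        return self.sample

    def step(self, mutate):
        self.sample = mutate(self.sample)          # tweak some bytes
        self.turn += 1
        evaded = not self.scanner.flags(self.sample)
        reward = 10.0 if evaded else 0.0           # the point for slipping through
        done = evaded or self.turn >= self.max_turns
        return self.sample, reward, done
```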

It does this by directly manipulating the bytes of the malware, but the resulting data must still be executable and fulfill its purpose after it passes through the AV engine. In other words, the malware-generating agent can't output a corrupted executable that slips past the scanner but, due to the deformities introduced in the binary to evade detection, crashes or doesn't work properly when run.
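
What kinds of edits leave a Windows executable intact? The snippet below shows a couple of illustrative, generally functionality-preserving tweaks – appending junk after the end of a PE file, padding it with zeroes – as examples of the sort of action space such an agent could pick from. These are examples only, not the exact action set Endgame uses.

```python
# Illustrative functionality-preserving mutations for a Windows PE file.
# Bytes appended after the end of the mapped image (the "overlay") are
# ignored by the loader, so the program still runs. Examples only.
import os
import random

def append_random_overlay(pe_bytes: bytes, n: int = 1024) -> bytes:
    return pe_bytes + os.urandom(n)

def append_zero_padding(pe_bytes: bytes, n: int = 4096) -> bytes:
    return pe_bytes + b"\x00" * n

ACTIONS = [append_random_overlay, append_zero_padding]

def random_mutation(pe_bytes: bytes) -> bytes:
    # A learning agent would pick the action it expects a reward for;
    # choosing at random is the dumbest possible baseline.
    return random.choice(ACTIONS)(pe_bytes)
```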

The virus-cooking bot is rewarded for getting working malicious files past the antivirus engine, so over time it learns the best sequence of moves for changing a malicious file so that it still functions and yet tricks the AV engine into thinking the file is friendly.

“It’s a much more difficult challenge than tricking image recognition models. The file still has to be able to perform the same function and have the same format. We’re trying to mimic what a real adversary could do if they didn’t have the source code,” says Anderson.

It’s a brute-force method. The agent and the AV engine were trained on 100,000 input malware seeds – after training, 200 malware files were given to the agent to tamper with. The tampered samples were then fed into the AV engine, and about 16 per cent of the evil files dodged the scanner, we're told. That seems low, but imagine crafting a strain of spyware that is downloaded and run a million times: that turns into 160,000 potentially infected systems under your control. Not bad.
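
The back-of-the-envelope sum behind that figure is simple enough:

```python
# Back-of-the-envelope check of the figures above.
evasion_rate = 0.16                      # ~16 per cent of tampered samples got through
downloads = 1_000_000
print(int(evasion_rate * downloads))     # -> 160000 potential infections
```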

After the antivirus engine was updated and retrained using those 200 computer-customized files, it was given another fresh 200 samples churned out by the virus-tweaking agent – and the evasion rate dropped to about half as the scanner got wise to the agent's tricks.
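
That hardening step – fold the evasive variants back into the training set, labelled as malware, and refit – is the easy part to illustrate. The toy example below uses scikit-learn and made-up feature vectors in place of real PE-file features, purely to show the shape of the retraining step rather than Endgame's actual pipeline.

```python
# Toy illustration of hardening a scanner by retraining on evasive samples.
# The feature vectors are synthetic stand-ins for real PE-file features.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
benign  = rng.normal(0.0, 1.0, size=(500, 32))
malware = rng.normal(1.0, 1.0, size=(500, 32))
evasive = rng.normal(0.5, 1.0, size=(200, 32))   # stand-in for agent-mutated files

X = np.vstack([benign, malware])
y = np.array([0] * 500 + [1] * 500)
scanner = GradientBoostingClassifier(random_state=0).fit(X, y)
print("flagged before retraining:", scanner.predict(evasive).mean())

# Label the evasive variants as malware and refit on the augmented data.
X2 = np.vstack([X, evasive])
y2 = np.concatenate([y, np.ones(200, dtype=int)])
scanner = GradientBoostingClassifier(random_state=0).fit(X2, y2)
print("flagged after retraining:", scanner.predict(evasive).mean())
```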

