Study: AI detects backdoor-unlocking DNA samples
The 4D chess equivalent of a supply-chain attack
How's this for a security threat? A backdoor hidden in lab software that is activated when fed a specially crafted digital DNA sample.
Typically, this backdoor would be introduced in a supply-chain attack, as we saw with the compromised SolarWinds monitoring tools. When the lab analysis software processes a digital sample of genetic material with the trigger encoded, the backdoor in the application activates: the trigger could include an IP address and network port to covertly connect to, or other instructions to carry out, allowing spies to snoop on and interfere with the DNA processing pipeline.
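To give a sense of how small such a trigger could be, here is a minimal sketch of packing an IPv4 address and port into nucleotides. It assumes a simple two-bits-per-base mapping (A=00, C=01, G=10, T=11), which is a common textbook encoding and not necessarily the scheme the researchers used. The function names are hypothetical and chosen for illustration only.

```python
import ipaddress
import struct

# Assumed mapping, not taken from the paper: two bits per nucleotide,
# A=00, C=01, G=10, T=11.
BASES = "ACGT"


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four bases, most significant bit pair first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def dna_to_bytes(seq: str) -> bytes:
    """Reverse of bytes_to_dna; rejects lengths that are not a multiple of 4."""
    if len(seq) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            # str.index raises ValueError on anything outside A, C, G, T.
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


def encode_endpoint(ip: str, port: int) -> str:
    """Pack an IPv4 address (4 bytes) and a port (2 bytes) into 24 bases."""
    packed = ipaddress.IPv4Address(ip).packed + struct.pack("!H", port)
    return bytes_to_dna(packed)


def decode_endpoint(seq: str) -> tuple[str, int]:
    """Recover the (ip, port) pair from a 24-base sequence."""
    raw = dna_to_bytes(seq)
    if len(raw) != 6:
        raise ValueError("expected exactly 24 bases")
    (port,) = struct.unpack("!H", raw[4:6])
    return str(ipaddress.IPv4Address(raw[:4])), port


if __name__ == "__main__":
    # 192.0.2.1 is a reserved documentation address (RFC 5737).
    seq = encode_endpoint("192.0.2.1", 443)
    print(seq, len(seq))
    print(decode_endpoint(seq))
```

Under this assumed encoding, an address and port fit in just 24 bases, a tiny fraction of a typical sequencing read, which helps explain why a trigger of this kind would be hard to spot by eye.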
It could be used to infiltrate national health institutions, research organizations, and healthcare companies, because few have recognized the potential of biological matter as the carrier or trigger of malware. Just as you can use DNA in living bacteria to hold information, this storage can be weaponized against applications processing that data.
In a typical sequencing process, the DNA strands go into a sequencer, which creates a digital file that is then analyzed by the computer connected to the sequencer. As you can imagine, this is how malicious but otherwise valid, sanitized data can be introduced into a lab: via a sample sent in for processing.
The University of Nebraska's Sasitharan Balasubramaniam, one of the leads behind a recent exploration of these vulnerabilities and what they mean for the emerging field of bio-cybersecurity, has detailed this threat, along with ways it can be enhanced and ways it can be caught in time.
This isn't science fiction
Back in 2017, in one of the rare pieces of biosecurity research focused on DNA sequencing, researchers at the University of Washington synthesized DNA so that, when it was converted into a digital file and fed into an application, a security flaw was exploited to open a backdoor network connection. That research relied on a vulnerability being present in the code, either accidentally or deliberately introduced.
The new effort builds on that, and involves trojan-horse software and a small, simple trigger in the DNA. "What's significant here in our work is that we looked at all the ways to hide this in the DNA and all the most efficient ways to do this so the code couldn't be found," Balasubramaniam explained.
"There's a concept in DNA research called steganography, which is used frequently in DNA coding. Using that we could hide this small bit of code very efficiently."
The good news is that using a deep-learning technique his team developed, it is possible to spot sneaky DNA manipulation. More on that is explained in the team's paper.
It's important to note the threat goes far beyond healthcare companies or national health services. At stake is not just the possibility of human patient data being manipulated once systems are compromised. Think of a large agricultural research company with massive volumes of genetic research.
"What we're saying here is that the impact is big: we need a rethink of how systems are secure, not only from the handling and storage of this data but how the data is sequenced and processed," Balasubramaniam said.
He and his team are not aware of this rethink happening in real organizations yet, but they say the risk is pressing and requires a new emphasis on biocybersecurity research. When The Register asked if sequencing companies were aware of this threat, Balasubramaniam said definitely not.
"We want to create awareness so that these companies aren't just thinking about anti-malware from a cyber-infrastructure standpoint, but from a bio-infrastructure one as well." ®