Can ChatGPT bash together some data-stealing code? With the right prompts, sure
But nothing a keen beginner couldn't do, anyway
A Forcepoint staffer has blogged about how he used ChatGPT to craft some code that exfiltrates data from an infected machine. At first, it sounds bad, but in reality, it's nothing an intermediate or keen beginner programmer couldn't whack together themselves anyway.
His experiment does, to some extent, highlight how the unreliable code-suggesting chatbot, built by OpenAI and pushed by Microsoft, could be used to cut some corners in malware development or automate the process.
It also shows how someone, potentially one without any coding experience, can make the bot jump its guardrails, which are supposed to prevent it from outputting potentially dangerous code, and have the AI service put together an undesirable program.
Forcepoint's Aaron Mulgrew, who confessed he is a novice, figured he wanted to create a program, without writing any code himself, that could exfiltrate data from an infected machine.
This program would be run after someone had done the hard work of breaking into a network via a vulnerability, guessing or obtaining a victim's login credentials, or social engineering. The malware would also need to overcome whatever local defenses there are, such as Windows Defender and policy controls on running software.
The exfiltration program, Mulgrew decided, would hunt for a large PNG file on the computer, use steganography to hide within that PNG a sensitive document on the system the intruder wished to steal – such as a spreadsheet of customers or product roadmap – and then upload the data-stuffed image to an attacker-controlled Google Drive account. Google Drive was chosen because most organizations allow connections to the cloud service.
Because the chatbot's guardrails prevent it from answering any prompt that includes "malware," more or less, developing this exfiltration tool required some creativity with the instructions to the bot. It took Mulgrew only two attempts, we're told, to start side-stepping these limitations.
Mulgrew says producing the tool took "only a few hours." His write-up on Tuesday of his experimentation can be found here, though (in our opinion) ignore the stuff about zero days and how the bot could write code that would take normal programmers days to do. There's no zero day, and this stuff can be bashed together within an hour or so by a competent human. An afternoon if you're new to handling files programmatically.
Since he couldn't simply ask ChatGPT to write malware, Mulgrew asked the chatbot to write small snippets of Go code he could manually stitch together. He also had the AI calling on Auyer's Steganographic Library to do the job of hiding high-value files in a large 5MB-plus PNG that the program had located on disk.
To find the high-value documents to steal, Mulgrew asked the AI to write code that iterates over the user's Documents, Desktop, and AppData folders on their Windows box, and locates any PDF or DOCX files with a maximum size of 1MB. The size cap ensures that the entire document can be embedded into a single image and, hopefully, smuggled out without raising any alarms.
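To see why a size cap matters, here is a rough back-of-the-envelope sketch in Python (the original tool was written in Go). It assumes a least-significant-bit scheme that hides one payload bit in each red, green, and blue value of every pixel, plus a small length header; whether Auyer's library works exactly this way, and the four-byte header figure, are our assumptions rather than anything Forcepoint stated. The function names are made up for illustration.

```python
def lsb_capacity_bytes(width, height, channels=3, bits_per_channel=1, header_bytes=4):
    """Estimate how many payload bytes an image can carry under an LSB scheme.

    Assumptions (ours, not from the write-up): one hidden bit per color
    channel and a 4-byte length header ahead of the payload.
    """
    total_bits = width * height * channels * bits_per_channel
    return max(total_bits // 8 - header_bytes, 0)


def payload_fits(payload_size, width, height):
    """Return True if a payload of payload_size bytes fits in a single image."""
    return payload_size <= lsb_capacity_bytes(width, height)


if __name__ == "__main__":
    one_mb = 1024 * 1024
    # A 12-megapixel image versus a 1080p screenshot.
    print(lsb_capacity_bytes(4000, 3000), payload_fits(one_mb, 4000, 3000))
    print(lsb_capacity_bytes(1920, 1080), payload_fits(one_mb, 1920, 1080))
```

Note that under these assumptions capacity depends on pixel count, not on the PNG's size on disk, so a "large" file is only a loose proxy for how much it can hide.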
"Combining the snippets using a prompt was surprisingly the easiest part, as I simply needed to post the code snippets I had managed to get ChatGPT to generate and combine them together," he wrote.
However, since most high-value documents worth stealing will likely be larger than 1MB, Mulgrew asked ChatGPT to write code to split a PDF into 100KB pieces, and insert each chunk into its own PNG, which would all be exfiltrated into the attacker's cloud storage. This took "four or five prompts," he noted.
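The chunking step is the sort of thing a beginner can write in a few lines. A minimal Python sketch of fixed-size splitting follows; it is not Mulgrew's Go code, the function name is hypothetical, and we assume "100KB" means 100 × 1,024 bytes.

```python
def split_into_chunks(data: bytes, chunk_size: int = 100 * 1024) -> list[bytes]:
    """Split data into consecutive pieces of at most chunk_size bytes.

    The last piece may be shorter. The 100 * 1024 default is our reading
    of "100KB" and is an assumption.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


if __name__ == "__main__":
    sample = bytes(250 * 1024)  # 250KB of zero bytes standing in for a file
    pieces = split_into_chunks(sample)
    print(len(pieces), [len(p) for p in pieces])
```

Joining the pieces back together in order reproduces the original bytes, which is all the receiving end would need to do.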
Next, Mulgrew wanted to make sure his final executable would go undetected through VirusTotal, which runs submitted files through various antivirus checkers to see if any recognize the binary as malicious. With some tweaks – such as asking ChatGPT to delay the start time of the program by two minutes, which fools some AV tools – and other massaging, such as obfuscating the code, he was eventually able to get the program through VirusTotal without any alarms going off, or so we're told.
That's kinda understandable as VirusTotal primarily catches bad programs already known to be malicious. Brand-new malware doesn't usually light up the dashboard right away. Some of these detection engines do employ sandboxing to catch malicious activity in novel samples, which can trigger alerts, but these can be evaded by those with enough skill – an AI chatbot isn't required to do so.
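As a toy illustration of why known-bad matching misses fresh binaries, consider a blocklist keyed on SHA-256 digests. This is a deliberate simplification of ours: real antivirus engines use far richer signatures and heuristics than a plain hash lookup, and the sample bytes and names below are invented.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical blocklist holding the digest of one previously seen sample.
KNOWN_BAD = {sha256_hex(b"previously seen sample")}


def is_known_bad(data: bytes) -> bool:
    """Flag data only if its exact digest is already on the blocklist."""
    return sha256_hex(data) in KNOWN_BAD


if __name__ == "__main__":
    print(is_known_bad(b"previously seen sample"))      # exact match
    print(is_known_bad(b"previously seen sample\x00"))  # one extra byte
```

Any program that has never been seen before has a digest no list contains, which is why behavioral checks such as sandboxing exist alongside this kind of matching.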
And again, ChatGPT recognizes commands such as "obfuscate the code to avoid detection" as unethical and blocks them, so would-be attackers would have to get creative with their input prompts. ®
Editor's note: This article was updated to incorporate commentary on Forcepoint's findings.