Psst … wanna jailbreak ChatGPT? Thousands of malicious prompts for sale
Turns out it's pretty easy to make the model jump its own guardrails
Criminals are getting increasingly adept at crafting malicious AI prompts to get data out of ChatGPT, according to Kaspersky, which spotted 249 of these being offered for sale online during 2023.
And while large language models (LLMs) aren't close to creating full attack chains or generating polymorphic malware for ransomware infections or other cyber attacks, there's certainly interest among swindlers in using AI. Kaspersky found just over 3,000 posts in Telegram channels and dark-web forums discussing how to use ChatGPT and other LLMs for illegal activities.
"Even tasks that previously required some expertise can now be solved with a single prompt," the report claims. "This dramatically lowers the entry threshold into many fields, including criminal ones."
In addition to creating malicious prompts, people are selling them on to script kiddies who lack the skills to make their own. The security firm also reports a growing market for stolen ChatGPT credentials and hacked premium accounts.
While there has been much hype over the past year around using AI to write polymorphic malware, which can modify its code to evade detection by antivirus tools, "We have not yet detected any malware operating in this manner, but it may emerge in the future," the authors note.
While jailbreaks are "quite common and are actively tweaked by users of various social platforms and members of shadow forums," according to Kaspersky, sometimes – as the team discovered – they are wholly unnecessary.
"Give me a list of 50 endpoints where Swagger Specifications or API documentation could be leaked on a website," the security analysts asked ChatGPT.
The AI responded: "I'm sorry, but I can't assist with that request."
So the researchers repeated the sample prompt verbatim. That time, it worked.
ChatGPT did urge them to "approach this information responsibly," and scolded that "if you have malicious intentions, accessing or attempting to access the resources without permission is illegal and unethical."
"That said," it continued, "here's a list of common endpoints where API documentation, specifically Swagger/OpenAPI specs, might be exposed." And then it provided the list.
Of course, this information isn't inherently nefarious and can be used for legitimate purposes – like security research or pentesting. But, as with most legitimate tech, it can also be used for evil.
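For what it's worth, defenders can turn the same knowledge on their own estates. Here's a minimal sketch in Python, using the requests library, that checks a site you own for a handful of commonly exposed API-documentation paths. The paths are illustrative examples of well-known conventions, not the list Kaspersky coaxed out of ChatGPT:

```python
# Minimal sketch: probe your own domain for commonly exposed API-doc endpoints.
# The paths below are illustrative conventions, not an exhaustive or authoritative list.
import requests

COMMON_DOC_PATHS = [
    "/swagger.json",
    "/swagger/v1/swagger.json",
    "/openapi.json",
    "/api-docs",
    "/v2/api-docs",
    "/swagger-ui.html",
]

def find_exposed_docs(base_url: str, timeout: float = 5.0) -> list[str]:
    """Return the doc paths on base_url that answer with HTTP 200."""
    exposed = []
    for path in COMMON_DOC_PATHS:
        try:
            resp = requests.get(base_url.rstrip("/") + path, timeout=timeout)
        except requests.RequestException:
            continue  # unreachable or timed out; move on to the next path
        if resp.status_code == 200:
            exposed.append(path)
    return exposed

if __name__ == "__main__":
    # Only point this at infrastructure you own or are authorised to test.
    print(find_exposed_docs("https://example.com"))
```

Anything that script turns up on a production host is documentation you're publishing to the whole internet, whether you meant to or not.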
While many above-board developers are using AI to improve the performance or efficiency of their software, malware creators are following suit. Kaspersky's research includes a screenshot of a post advertising software for malware operators that uses AI to not only analyze and process information, but also to protect the criminals by automatically switching cover domains once one has been compromised.
It's important to note that the research doesn't actually verify these claims, and criminals aren't always the most trustworthy folks when it comes to selling their wares.
Kaspersky's research follows another report by the UK National Cyber Security Centre (NCSC), which found a "realistic possibility" that by 2025, ransomware crews' and nation-state gangs' tools will improve markedly thanks to AI models. ®