AI slurps, learns millions of passwords to work out which ones you may use next

Get creative – bringbackfirefly! will no longer cut it, nerds

Wed 20 Sep 2017 // 07:02 UTC

Eggheads have produced a machine-learning system that has studied millions of passwords used by folks online to work out other passphrases people are likely to use.

These AI-guessed passwords could be used with today's tools to crack more hashed passwords, and log into more strangers' accounts on systems, than ever before.

When it comes to cracking a password, you typically start with a hashed version of the passphrase, stolen from a database or similar. Hashed means the password has been scrambled one-way: you can't reverse the hash to get the original. Today's tools can brute-force their way through all possible combinations of characters (such as AAAAA, AAAAB, AAAAC etc) for a password, calculating a hash for each combo and comparing it to the stolen hash. If they match, there's your password. This is particularly compute-intensive, especially if the hashes are individually salted, because each guess then has to be re-hashed for every salt.
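To make the brute-force loop concrete, here is a minimal Python sketch. It assumes plain SHA-256 with an optional salt prefix purely for illustration; the article does not say which hash function any given leak used, and the helper names `sha256_hex` and `brute_force` are made up for this example.

```python
import hashlib
from itertools import product
from string import ascii_lowercase


def sha256_hex(password: str, salt: str = "") -> str:
    """Hash salt + password with SHA-256 (an assumed scheme) and return hex."""
    return hashlib.sha256((salt + password).encode("utf-8")).hexdigest()


def brute_force(target_hash: str, alphabet: str, max_len: int, salt: str = ""):
    """Try every string over `alphabet` of length 1..max_len, shortest first."""
    for length in range(1, max_len + 1):
        for combo in product(alphabet, repeat=length):
            candidate = "".join(combo)
            # Hash the guess and compare it with the stolen hash.
            if sha256_hex(candidate, salt) == target_hash:
                return candidate
    return None  # search space exhausted without a match


# Stand-in for a hash lifted from a leaked database.
stolen = sha256_hex("cab")
print(brute_force(stolen, ascii_lowercase, max_len=3))  # prints "cab"
```

The search space grows as the alphabet size raised to the password length, which is why this approach runs out of steam quickly. With per-account salts, the whole loop has to be repeated for each stolen hash rather than shared across them.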

Alternatively, as an optimized approach, a tool can take a dictionary of words and commonly used passwords – as well as previously cracked passphrases – and turn them into hashes to check against the stolen hash or hashes.
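A dictionary attack along these lines can be sketched in a few lines of Python. Again, unsalted SHA-256 is an assumption made for the sake of the example, and the word list and the `dictionary_attack` helper are hypothetical rather than taken from HashCat or John the Ripper.

```python
import hashlib


def sha256_hex(password: str) -> str:
    """Unsalted SHA-256 hex digest (an assumed scheme for this sketch)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def dictionary_attack(stolen_hashes, wordlist):
    """Hash each candidate once and look it up among the stolen hashes."""
    remaining = set(stolen_hashes)
    cracked = {}
    for word in wordlist:
        digest = sha256_hex(word)
        if digest in remaining:
            cracked[digest] = word  # map the stolen hash back to its password
    return cracked


# Hypothetical word list: common passwords plus previously cracked ones.
wordlist = ["123456", "password", "iloveyou", "bringbackfirefly!"]
stolen_hashes = [
    sha256_hex("iloveyou"),
    sha256_hex("bringbackfirefly!"),
    sha256_hex("x9$Qv!2p"),  # not in the word list, so it survives
]

cracked = dictionary_attack(stolen_hashes, wordlist)
print(sorted(cracked.values()))  # prints ['bringbackfirefly!', 'iloveyou']
```

Because the hashes here are unsalted, each guess is hashed once and checked against every stolen hash in a single set lookup. The attack only ever finds passwords that are already on the list, which is the gap a password-guessing model aims to fill.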

But what if software could be trained to stay one step ahead and predict the passwords people are going to use, or are using right now, based on what they've all done in the past?

A team at the Stevens Institute of Technology in New Jersey, USA, this month produced a paper [PDF] detailing PassGAN, a generative adversarial network in which two machine-learning systems train each other. Using it, they say they were able to double the code-cracking skills of open-source tools HashCat and John the Ripper and, more importantly, use this to protect against password-stealing attacks.

The researchers took their machine-learning system and fed it 32,603,388 plain-text passwords taken from the 2010 leak from music site RockYou, and let it work out the rules that people were using to generate their passphrases. It then attempted to use this knowledge to crack a hashed list of passwords taken during the 2016 LinkedIn intrusion.

At first, the AI correctly guessed 46.85 per cent of the RockYou passwords it was trained on – 2,774,269 out of 5,919,936 – and 11.53 per cent of the LinkedIn passwords – 4,996,980 out of 43,354,871. If you exclude from the correctly guessed LinkedIn passwords any passphrases it saw during the RockYou training, the number of correctly generated passwords drops to 3,890,043 or 9.582 per cent. In other words, the AI was able to crack one in ten hashed LinkedIn passwords it had never seen before.

PassGAN thus outperformed John the Ripper, which was able to crack 6.37 per cent of the LinkedIn passwords (and 4.98 per cent once the passphrases seen during RockYou training are excluded), but lagged behind HashCat, which cracked 22.9 per cent and 17.67 per cent respectively. When the neural network software was combined with HashCat, it fared better, as you'd expect, cracking 27 per cent and 22.039 per cent of the leaked account database, respectively. In other words, the AI and HashCat together could crack between one in five and one in four LinkedIn password hashes.

To achieve all this, PassGAN had to come up with 528,834,530 passwords, HashCat generated 441,357,719, and John the Ripper also produced 528,834,530. The combined HashCat and AI run produced 947,606,924 passphrases.
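The arithmetic behind these figures can be sketched as set operations. The snippet below uses a toy target list to show how a match rate and a combined guess list might be computed, then plugs in the numbers quoted above. It assumes the combined total is the de-duplicated union of the two guess lists, which the article does not state outright; `match_rate` and `relative_gain` are illustrative helpers, not part of PassGAN or HashCat.

```python
def match_rate(guesses: set, targets: set) -> float:
    """Fraction of target passwords that appear among the guesses."""
    return len(guesses & targets) / len(targets)


def relative_gain(combined_pct: float, baseline_pct: float) -> float:
    """How much better the combined run does than the baseline alone."""
    return combined_pct / baseline_pct - 1


# Toy data standing in for a leaked password list and two guess lists.
targets = {"iloveyou", "monkey12", "dragon!", "qwerty1", "x9$Qv!2p"}
rule_guesses = {"iloveyou", "qwerty1", "password1"}    # rule-based tool
model_guesses = {"iloveyou", "monkey12", "sunshine7"}  # learned model
combined = rule_guesses | model_guesses                # de-duplicated union

print(match_rate(rule_guesses, targets))   # prints 0.4
print(match_rate(model_guesses, targets))  # prints 0.4
print(match_rate(combined, targets))       # prints 0.6

# Guess counts quoted in the article.
passgan_total = 528_834_530
hashcat_total = 441_357_719
combined_total = 947_606_924

# If the combined figure is a union, this many guesses were made by both.
overlap = passgan_total + hashcat_total - combined_total
print(overlap)  # prints 22585325

# Gain of HashCat plus PassGAN over HashCat alone, from the quoted rates.
print(f"{relative_gain(27, 22.9):.1%}")       # prints 17.9%
print(f"{relative_gain(22.039, 17.67):.1%}")  # prints 24.7%
```

Under that assumption, the two tools overlapped on only about 22.6 million guesses out of several hundred million each, which fits the idea that the model reaches passwords the rules do not.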

The team summarized their work thus:

Our experiments show that this approach is very promising. When we evaluated PassGAN on two large password datasets, we were able to outperform John the Ripper’s SpyderLab rules by a 2x factor, on average, and we were competitive with the best64 and gen2 rules from HashCat — our results were within a 2x factor from HashCat’s rules. More importantly, when we combined the output of PassGAN with the output of HashCat, we were able to match 18%-24% more passwords than HashCat alone. This is remarkable because it shows that PassGAN can generate a considerable number of passwords that are out of reach for current tools.

"Also, our evaluation of training performance suggests that, when supplied with a large enough leaked password set, the performance of PassGAN could surpass that of the best rule-based password generation techniques," they added.

In other words, HashCat is still good. And this early stage AI can fill in the gaps – until it overtakes the well-known tool. ®
