Phuck off, phishers! JPMorgan Chase crafts AI to sniff out malware menacing staff networks

Machine-learning code predicts whether connections are legit or likely to result in a bad day for someone


JPMorgan Chase is integrating AI into its internal security systems to thwart malware infections within its own networks.

A formal paper [PDF] emitted this month by techies at the mega-bank describes how deep learning can be used to identify malicious activity, such as spyware on staff PCs attempting to connect to hackers' servers on the public internet. It can also finger URLs in received emails as suspicious. And it’s not just an academic exercise: some of these AI-based programs are already in production use within the financial giant.

The aim is, basically, to detect and neutralize malware that an employee may have accidentally installed on their workstation after, say, opening a booby-trapped attachment in a spear-phishing email. It can also block web-browser links that would lead the employee to a page that would attempt to install malware on their computer.

Neural networks can be trained to act as classifiers, and predict whether connections to the outside world are legit or malicious: bogus connections may well be, for example, attempts by snoopware on an infected PC to reach the outside world, or a link to a drive-by-download site. These decisions are based on the URL or domain name used to open the connection. Specifically, the long short-term memory (LSTM) networks used in the bank's AI software predict whether a particular URL or domain name is real or fake. The engineers trained theirs using a mixture of private and public datasets.

The public datasets included a list of real domains scraped from the top million websites as listed by Alexa; the team also used 30 different domain generation algorithms (DGAs), typically used by malware, to spin up a million fake malicious domains. For the URL data, they took 300,000 benign URLs from the DMOZ Open Directory Project dataset and 267,418 phishing URLs from the Phishtank dataset. The researchers didn’t specify the proportion of data used for training, validation, and testing.
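For readers who haven't met a DGA before, here is a minimal sketch of the general idea: malware and its operators share a seed, and both sides derive the same throwaway domain names from it on a given day. This is a toy written for illustration, not one of the 30 algorithms the researchers used; the function name `toy_dga`, the seed string, and the hashing scheme are all assumptions.

```python
import hashlib
from datetime import date


def toy_dga(seed, day, count=5, tld=".com"):
    """Derive `count` pseudo-random domain names from a shared seed and a date.

    Hypothetical scheme for illustration only; real DGA families differ widely.
    """
    domains = []
    for i in range(count):
        material = f"{seed}-{day.isoformat()}-{i}".encode()
        digest = hashlib.sha256(material).hexdigest()
        # Map the first 12 hex digits onto the letters a..p so the label
        # looks like the random gibberish typical of generated domains.
        label = "".join(chr(ord("a") + int(c, 16)) for c in digest[:12])
        domains.append(label + tld)
    return domains


if __name__ == "__main__":
    for name in toy_dga("example-seed", date(2019, 1, 1)):
        print(name)
```

Because the output depends only on the seed and the date, anyone holding the seed can regenerate the list, which is what makes such domains cheap to produce in bulk as negative training examples.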

You might think that simply firewalling off and logging all network traffic from bank workers' PCs to the outside world would be enough to catch naughty connections. Clearly, though, JPMorgan doesn't mind its staff reading the likes of El Reg at lunch, and so, it seems, has turned to machine learning to improve its network monitoring while still allowing connections to the outside world.

How it works

First, the string of characters in a particular URL or domain name to be checked is converted into vectors and fed into the LSTM as input. The model then spits out a number: the probability that the URL or domain name is bogus.
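The two steps above can be sketched in a few dozen lines of plain Python. This is an illustration of the data flow only, under several assumptions: the alphabet, the one-hot encoding, the hidden size, and the class name `TinyLSTMClassifier` are all made up here, biases are omitted for brevity, and the weights are random rather than trained, so the probability it emits is meaningless. The bank's actual model and encoding are not described at this level of detail.

```python
import math
import random

# Assumed character set; ID 0 is reserved for anything outside it.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-._/:?=&%"
CHAR_TO_ID = {ch: i + 1 for i, ch in enumerate(ALPHABET)}
VOCAB_SIZE = len(ALPHABET) + 1


def encode(text):
    """Turn a URL or domain string into a list of integer character IDs."""
    return [CHAR_TO_ID.get(ch, 0) for ch in text.lower()]


def one_hot(index):
    """Represent one character ID as a vector with a single 1.0 entry."""
    vec = [0.0] * VOCAB_SIZE
    vec[index] = 1.0
    return vec


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTMClassifier:
    """Single LSTM layer plus a logistic output. Weights are random, NOT trained."""

    def __init__(self, hidden_size=8, seed=0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        n_in = VOCAB_SIZE + hidden_size

        def matrix():
            return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                    for _ in range(hidden_size)]

        # One weight matrix per gate: input, forget, output, candidate state.
        self.w_i, self.w_f, self.w_o, self.w_g = matrix(), matrix(), matrix(), matrix()
        self.w_out = [rng.uniform(-0.1, 0.1) for _ in range(hidden_size)]

    @staticmethod
    def _matvec(w, v):
        return [sum(wi * vi for wi, vi in zip(row, v)) for row in w]

    def predict_proba(self, text):
        """Read the string one character at a time and return a value in (0, 1)."""
        h = [0.0] * self.hidden_size  # hidden state
        c = [0.0] * self.hidden_size  # cell state (the "memory")
        for idx in encode(text):
            x = one_hot(idx) + h  # current character joined with previous hidden state
            i = [sigmoid(z) for z in self._matvec(self.w_i, x)]
            f = [sigmoid(z) for z in self._matvec(self.w_f, x)]
            o = [sigmoid(z) for z in self._matvec(self.w_o, x)]
            g = [math.tanh(z) for z in self._matvec(self.w_g, x)]
            c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
            h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        logit = sum(w * hj for w, hj in zip(self.w_out, h))
        return sigmoid(logit)


if __name__ == "__main__":
    model = TinyLSTMClassifier()
    print(encode("a.b"))
    print(model.predict_proba("example.com"))
```

In a trained system the weights would be learned from labelled benign and malicious strings, and a threshold on the output would decide whether to flag the connection. One-hot vectors are used here for simplicity; a learned embedding layer is a common alternative.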

The LSTM achieved a score of 0.9956 (with one being the optimal result) when classifying phishing URLs, and 91 per cent accuracy for DGA domains, with a 0.7 per cent false positive rate. AI is well adapted to discovering the common patterns and techniques used in malicious software, and can even be more effective than traditional URL and domain-name filters.
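To make the accuracy and false positive figures concrete, here is a small sketch of how the two metrics are computed from a classifier's predictions. The test set below is hypothetical: the counts are invented so the arithmetic lands on 91 per cent and 0.7 per cent, and they are not taken from the paper, which may have used a very different class balance.

```python
def confusion_counts(labels, predictions):
    """Count outcomes, treating 1 as malicious (positive) and 0 as benign (negative)."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Share of all examples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def false_positive_rate(tn, fp):
    """Share of benign examples that were wrongly flagged as malicious."""
    return fp / (fp + tn)


if __name__ == "__main__":
    # Hypothetical test set: 1,000 benign followed by 1,000 malicious domains.
    labels = [0] * 1000 + [1] * 1000
    # Invented predictions: 7 benign domains flagged, 173 malicious ones missed.
    predictions = [0] * 993 + [1] * 7 + [1] * 827 + [0] * 173
    tp, tn, fp, fn = confusion_counts(labels, predictions)
    print(accuracy(tp, tn, fp, fn))      # 0.91
    print(false_positive_rate(tn, fp))   # 0.007
```

The false positive rate matters as much as the headline accuracy here: every benign site wrongly flagged is a legitimate connection that gets blocked or needs a human to review it.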

We asked the eggheads to describe what features the model learned when identifying whether something is benign or malicious, but they declined to comment. It’s probably things like typos in words or random snippets of characters and numbers jumbled together.

“Advanced Artificial Intelligence (AI) techniques, such as Deep learning, Graph analysis, play a more significant role in reducing the time and cost of manual feature engineering and discovering unknown patterns for Cyber security analysts,” the researchers said.

Next, they hope to experiment with other types of neural networks, such as convolutional neural networks and recurrent neural networks, to clamp down on the spread of malware even further. Watch this space. ®

Biting the hand that feeds IT © 1998–2022