Could you hack your bosses without hesitation, repetition or deviation? AI says: No

Sprinkling a little machine learning into bad behavior detection

Comment Businesses find themselves in a world where the threat to their networks often comes not simply from a compromise of their computers, servers, or infrastructure, but from legitimate, sanctioned users.

There is nothing new about the notion of cyber-attackers seeing human beings as their biggest target. For years, real-world attacks have repeatedly exploited ordinary user and powerful admin accounts to gain a foothold in a network. Usually, they have done this by tricking humans into handing over their credentials or running malware on their work PCs, by brute-forcing accounts, by exploiting vulnerabilities, or through similar techniques.

But attackers who are already on the inside of a network, abusing their credentials for nefarious ends without anyone being the wiser, are rapidly gaining notoriety.

In principle, it’s possible to secure, patch, and lock down devices from external attack. Crucially, it’s not as easy to guard against internal network users going rogue. Barriers and compartments can and should be put in place to limit access and any damage done. However, these may be circumvented via vulnerabilities, determined insiders, or managers demanding special access for their staff.

The default coping mechanism is assumption-based security — or hoping for the best. In other words, if a user authenticates using legitimate credentials, then the balance of probability is that they are who they say they are and should be trusted.

A web of technologies has grown up to mitigate this issue. Privilege management and enhanced authentication technologies are prime examples, but implementing these solutions creates a world of complexity for admins attempting to manage many different systems, each designed to close one aspect of the user problem.

Bad behavior

One alternative has been to monitor users for bad actions using conventional application and network logs, though attackers found blindspots in these systems, and exploited them to evade detection. The biggest weakness was the idea of defining what a user was and wasn’t allowed to do in terms of a set of static rules. The key word here is static. Some IT admins swear by their comprehensive and finely tuned static rules, proud that they detect all sorts of weird and wonderful malicious activity, and some admins are rather good at writing them. However, as external and internal miscreants grow more sophisticated, a more sophisticated means to detect bad behavior is perhaps needed.

Judge for yourself: in 2014, Gartner articulated a new concept of so-called user behavior analytics (UBA), later refined into user and entity behavior analytics (UEBA). Conceptually, UEBA is the ultimate expression of the collapse of the perimeter security model. It became obvious that the perimeter could be in thousands of places at once, around everything and anything – especially users and their accounts.

In a network where nothing is inherently trustworthy, a new yardstick is needed. In UEBA, that role is filled by the idea of the anomaly.

But what is an anomaly? In a perimeter network, a user who is breaking, or attempting to break static rules, certainly looks like an anomaly, but UEBA takes a slightly more sophisticated approach. It tries to understand users, accounts, devices, and applications based on their intention – a measure that is based in turn on building up a kind of profile of acceptable or standard behavior. This profile is part of what’s known as baselining.

Baselining: how deep can you go?

Baselining is the heart of this model of security. It forms an automatically adjusting definition of what is normal, and what is not normal and therefore unwanted, in terms of internal user activity. UEBA vendors implement this principle in different ways, but the essence is to turn network security into a problem of big data analytics where the raw material is sifted using automated machine learning.

Indicators from multiple monitoring systems are aggregated into databases that translate a mountain of data into something machines can process by applying a set of algorithms. Because the data has a huge variety of characteristics, this is no mean task, but the center of gravity is always the user context – what is the user’s state and how might this affect security? Within this, there’s a spectrum of options from behavior to scenario-based analytics.

Baselining is not a new idea for security, but the addition of machine learning harnessed to big data has supercharged what seems to be possible by expanding the complexity of data input through which a baseline and any deviation from it can be understood.

Rather than imposing a set of rules or norms on networks, baselining analyzes behavior to define this normal state. This varies depending on the context. For example, the range of contacts a user will interact with through an email system and the nature of that communication will almost always be within certain limits.
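To make the email example concrete, here is a minimal sketch of what a contact baseline might look like. It is an illustration only, not any vendor's method: the function names, the `min_count` cut-off, and the addresses are all hypothetical, and a real UEBA product would draw on far richer signals.

```python
from collections import Counter


def build_contact_baseline(history, min_count=2):
    """Return the recipients a user has emailed at least min_count times.

    min_count is an assumed cut-off so that one-off contacts do not
    become part of the "normal" profile.
    """
    counts = Counter(history)
    return {recipient for recipient, n in counts.items() if n >= min_count}


def novelty_ratio(baseline, recent):
    """Fraction of recent recipients that fall outside the baseline."""
    if not recent:
        return 0.0
    unseen = [r for r in recent if r not in baseline]
    return len(unseen) / len(recent)


# Hypothetical mail history for one user.
history = [
    "ann@corp.example", "bob@corp.example",
    "ann@corp.example", "bob@corp.example",
    "cat@corp.example",
]
baseline = build_contact_baseline(history)

# Half of this recent batch goes to addresses never seen before.
recent = [
    "ann@corp.example", "x@other.example",
    "y@other.example", "bob@corp.example",
]
ratio = novelty_ratio(baseline, recent)
print(sorted(baseline), ratio)
```

The point of the sketch is that nobody wrote a rule saying who this user may email; the "limits" are learned from what the user has already done.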

Deviants detected

Baselining can be used to model this for each user or for groups of users. Once baselines are established, UEBA identifies sudden deviations from the pattern. Similarly, the applications and internal resources a user accesses will also fall within certain limits, which should mean that deviations, such as an unusual time of day or an unfamiliar IP address for the machine from which the resource is being accessed, will stand out.
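A rough sketch of how such a deviation might be scored is shown below. It assumes a simple z-score on login hour plus a fixed penalty for an unfamiliar IP address; the function name, the penalty value, and the sample data are invented for illustration, and the hour is treated as a plain number rather than a circular value, which a production system would want to handle properly.

```python
from statistics import mean, pstdev


def deviation_score(baseline_hours, known_ips, event_hour, event_ip):
    """Score how far an access event sits from a user's baseline.

    Combines a z-score on hour of day with an assumed flat penalty
    of 2.0 for an IP address not seen in the baseline.
    """
    mu = mean(baseline_hours)
    # Guard against a zero spread when every baseline hour is identical.
    sigma = pstdev(baseline_hours) or 1.0
    z = abs(event_hour - mu) / sigma
    ip_penalty = 0.0 if event_ip in known_ips else 2.0
    return z + ip_penalty


# Hypothetical baseline: office-hours logins from two known machines.
baseline_hours = [9, 9, 10, 8, 9, 10, 9, 8]
known_ips = {"10.0.0.12", "10.0.0.40"}

normal = deviation_score(baseline_hours, known_ips, 9, "10.0.0.12")
odd = deviation_score(baseline_hours, known_ips, 3, "203.0.113.7")
print(normal, round(odd, 2))
```

A 9am login from a familiar machine scores zero, while a 3am login from an unknown address scores far higher. Neither event breaks a static rule; only the second departs from the pattern.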

Using this approach, the security admin’s job is no longer to define what can and can’t be done. Rather, it is to set the point at which a deviation from the baseline breaches an acceptable threshold and should turn into an alert.

You have to be careful, though. Set the threshold too low, and the chance of a false positive rises, but set it too high and an attack might be missed. In theory, UEBA baselining should make the chances of either less likely because the baseline covers a range of indicators, and not simply a single application or type of access.
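The trade-off can be seen in a toy example. The sketch below assumes each event already has an anomaly score and, for the sake of the exercise, a known label saying whether it was malicious; the scores, labels, and the `evaluate_threshold` helper are all made up for illustration.

```python
def evaluate_threshold(scored_events, threshold):
    """Count false positives and missed attacks at a given alert threshold.

    scored_events is a list of (score, is_malicious) pairs; an alert
    fires when score >= threshold.
    """
    false_positives = sum(
        1 for score, malicious in scored_events
        if score >= threshold and not malicious
    )
    missed = sum(
        1 for score, malicious in scored_events
        if score < threshold and malicious
    )
    return false_positives, missed


# Hypothetical scored events: (anomaly score, was it really malicious?)
events = [
    (0.5, False), (1.2, False), (2.8, False), (3.5, True),
    (1.9, True), (6.0, True), (0.9, False), (4.1, False),
]

for threshold in (1.0, 3.0, 5.0):
    fp, missed = evaluate_threshold(events, threshold)
    print(f"threshold={threshold}: false positives={fp}, missed={missed}")
```

On this invented data, the lowest threshold catches every attack at the cost of three false alarms, while the highest raises no false alarms but misses two of the three attacks.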

What sets a good UEBA system apart? The depth and sophistication of baselining — something only UEBA-specific systems are capable of achieving. The lesson here is to beware of vendors that have simply applied the name to an older set of technologies, because you won’t get such depth. A second characteristic of a good UEBA system is the type of statistical modeling that informs the machine-learning algorithms, and the ability to evolve and cope with natural changes in the way networks and their users behave.
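One simple way a baseline can evolve with natural change is an exponentially weighted moving average, sketched below. This is an assumed stand-in for whatever statistical model a given product actually uses; the `update_baseline` name, the smoothing factor `alpha`, and the file-access counts are hypothetical.

```python
def update_baseline(current, observation, alpha=0.1):
    """Blend a new observation into the baseline.

    alpha is an assumed smoothing factor: higher values adapt faster
    but also let an attacker shift the baseline more quickly.
    """
    return (1 - alpha) * current + alpha * observation


# Hypothetical daily count of files a user opens, drifting up as
# their workload grows.
baseline = 100.0
for daily_files in [100, 104, 108, 112, 116, 120]:
    baseline = update_baseline(baseline, daily_files, alpha=0.5)

print(baseline)
```

The baseline trails the gradual drift rather than treating each small rise as an anomaly, while a sudden jump to, say, 1,000 files in a day would still sit far above it.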

Of course, there’s a catch, and an obvious challenge with this model is devising thresholds that reflect different users in different contexts, particularly when trying to minimize insider attacks by privileged accounts. Frankly, there is no easy answer to this, although UEBA advocates argue that all malicious activity will offer giveaways, such as accessing valuable data in an unusual way.

Another challenge is the existence of temporary accounts and users who need to be given access to a network, such as external contractors. UEBA offers a structure for applying machine learning and baselining to security, though the real world remains a complex place.

Paul Simmonds, chief exec of the Global Identity Foundation, whose career includes stints as CISO of AstraZeneca and ICI, explained the complexities of these challenges. “The problem we all face is that it is very difficult to apply UEBA to anything other than entities within our locus-of-control,” he told The Reg.

“Thus, you can make it work for your banking customers, or your employees, but have real problems expanding beyond that due to a lack of ability to understand people, devices, organizations, code and agents from outside.”

Simmonds also questions whether the phrase UEBA is always helpful: “What you actually are trying to understand is simply context – do we understand what the entity is trying to do? Contextual-based analytics would be more accurate.”

There is a final challenge that no UEBA system can solve on its own, and the answer to which really is down to you: the response. An alert is one thing, but what next? You need to process and respond to what you’re being told as rapidly as possible, which probably means acting in minutes to mitigate sophisticated attacks.

Some have proposed automated response as the next frontier for AI-driven security, but in most security operations centres, this will still come down to difficult choices made by men and women using their own judgment and a handbook of response tools.

The network perimeter has been compromised by attackers, threats, and risks on both sides of the firewall. Anticipating activity and actions considered out of the ordinary is a powerful new security model in this world of zero trust. UEBA isn’t a silver bullet – it comes with calibration requirements – but the use of machine learning to build a baseline is a far more intelligent approach compared to the old method of brittle rules for staying safe in this new and complex world. ®
