Why Bayesian models excel at finding rogue insiders



Insider threat actors can cause harm to an organization in all kinds of ways, from intellectual property theft, financial fraud and data breaches to espionage, sabotage and even terrorism. Moreover, the root causes of their acts can range from malicious intent to willful negligence, and sometimes even just pure carelessness.

One case often looks very different from the next, and it is precisely this complexity and behavioral variability that makes finding insider threats so tricky. The IP thief may be motivated by greed, while the saboteur is driven by disgruntlement and the spy by a nation’s or competitor’s interest. When the motivations differ, their underlying risk indicators differ as well; just look at the yawning gap between indicators of malice vs. sloppiness.

Add it all up and, if you're responsible for InfoSec or PerSec at your organization, you have a daunting task on your hands. Detection is hard enough; prevention, the real holy grail, will be far harder still.

I've long been a proponent of using Bayesian models to solve an array of wicked security problems. In a two-part post in these pages early this year, I described first their capabilities and some common objections to their use, and second their unique power in security analytics applications like insider threat mitigation.

After almost a year's worth of user complaints about excessive false positives and analyst overload, it's even clearer today that Bayesian modeling works, whereas other security analytics approaches, such as machine learning and rules-based systems used on their own, may create new problems even while solving old ones. And data-driven analytics tools that rely solely on feeds generated from network and device usage certainly can't capture the kinds of psychological and behavioral factors so vital to insider threat detection or to similarly complex security challenges like account compromise.

This got me thinking: how well understood are the real-world benefits of Bayesian modeling? It turns out the answer is quite well — but mostly by specialists. A search of the scientific literature reveals any number of research studies where a Bayesian model that captured experts’ beliefs and inferred probabilities from them proved superior to other AI and analytics approaches. The studies are detailed — and their conclusions unequivocal: Bayesian models, properly built and applied, are terrific at predictively identifying risk from insider threats, which means they are good not just for detection but for prevention as well.

Take Pacific Northwest National Laboratory (PNNL). Seven years ago, it conducted an experiment using Bayesian models with the express purpose of finding malicious insider threats, because "any attempt to seriously address the insider threat, particularly through proactive means, must consider behavioral indicators in the workplace in addition to more traditional workstation monitoring methods." Exactly.

PNNL recruited human resources specialists and captured a list of 12 "psychosocial" behaviors that they judged to be highly indicative of future malicious insider risk — including disgruntlement, stress, anger-management issues, disregard for authority, confrontational behavior and lack of dependability. The 12 behavioral indicators were implemented as binary (i.e., true/false) random variable nodes in a Bayesian inference network, or model. Prior probabilities of each indicator, based on the subjective judgments of the specialists, were then assigned to each variable along with their relative weights. Finally, the experts determined the relative influence of each random variable on the risk output.
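For readers who think in code, here is a minimal sketch of how binary indicators and expert-assigned weights could combine into a single risk probability. It is not PNNL's model: the weights, the baseline value and the function name are hypothetical, and the combination rule (a "noisy-OR," a common way to merge binary causes in a Bayesian network) is my assumption, since the details of the lab's network are not given here.

```python
# Hypothetical expert weights: the chance that each indicator, acting alone,
# would push an employee into the "at risk" state. Illustrative values only.
WEIGHTS = {
    "disgruntlement": 0.30,
    "stress": 0.10,
    "anger_management_issues": 0.25,
    "disregard_for_authority": 0.25,
    "confrontational_behavior": 0.20,
    "lack_of_dependability": 0.10,
}
LEAK = 0.01  # hypothetical baseline risk when no indicator is present


def risk_given_indicators(observed):
    """Noisy-OR estimate of P(risk | indicators).

    `observed` maps indicator names to True/False. Each indicator that is
    True independently reduces the probability that the employee is NOT a risk.
    """
    p_no_risk = 1.0 - LEAK
    for name, present in observed.items():
        if present:
            p_no_risk *= 1.0 - WEIGHTS[name]
    return 1.0 - p_no_risk


if __name__ == "__main__":
    case = {"disgruntlement": True, "stress": True, "lack_of_dependability": False}
    print(f"Estimated risk: {risk_given_indicators(case):.3f}")
```

The appeal of a rule like this is that every factor in the result traces back to a named indicator and a weight an expert chose, which is the kind of simple explanation a risk score needs if analysts are going to trust it.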

When PNNL asked a group of human evaluators to rate 24 employee cases on a 10-point risk scale from 'highest concern' to 'no concern,' it found striking similarities between their consensus views as to the employees' likely future riskiness and results from the same employee data run through the Bayesian model. (For comparison purposes, PNNL also ran the data through a linear regression, a feedforward artificial neural network and a counting model, but found that the Bayesian model (1) was better suited to working with missing data because it uses prior probabilities; (2) provided useful probability estimates where the other methods could not; and (3), at least in comparison to the neural network, was "more acceptable to users because it provides simpler explanations of why specific risks are assigned.")
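The missing-data point is worth a small illustration. The sketch below, again with hypothetical priors and weights and an assumed noisy-OR combination rule rather than anything taken from the PNNL study, shows how a Bayesian model can still return a probability when some indicators were never recorded: it sums over every possible value of each unrecorded indicator, weighting each possibility by its prior.

```python
from itertools import product

# Hypothetical numbers for illustration only (not from the PNNL study).
PRIORS = {"disgruntlement": 0.05, "stress": 0.20, "confrontational_behavior": 0.05}
WEIGHTS = {"disgruntlement": 0.30, "stress": 0.10, "confrontational_behavior": 0.20}
LEAK = 0.01  # baseline risk when no indicator is present


def noisy_or(state):
    """P(risk) when every indicator in `state` has a known True/False value."""
    p_no_risk = 1.0 - LEAK
    for name, present in state.items():
        if present:
            p_no_risk *= 1.0 - WEIGHTS[name]
    return 1.0 - p_no_risk


def risk_with_missing(observed):
    """P(risk | observed), marginalizing over indicators that were not recorded."""
    missing = [name for name in PRIORS if name not in observed]
    total = 0.0
    # Enumerate every True/False combination of the unrecorded indicators.
    for values in product([True, False], repeat=len(missing)):
        state = dict(observed)
        prior_weight = 1.0
        for name, value in zip(missing, values):
            state[name] = value
            prior_weight *= PRIORS[name] if value else 1.0 - PRIORS[name]
        total += prior_weight * noisy_or(state)
    return total


if __name__ == "__main__":
    # Only one indicator was recorded; the other two fall back on their priors.
    print(f"{risk_with_missing({'disgruntlement': True}):.4f}")
```

A counting model or a regression fed the same incomplete record would need the gaps filled in by hand or the case dropped; here the priors do that work, and the output remains a probability.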

In a paper PNNL published on the experiment, the authors write: "the ‘average’ risk predictions generated by a model representing these experts' consolidated wisdom is better than the prediction that an individual expert can provide due to possible information processing limitations, individual biases, or varying experiences. An expert system model also enables the automatic screening of staff members, which is consistent and independent of the experiences an individual human resources staff may have."

Furthermore, "We believe that if the developed model is incorporated to monitor employees with proper recording of the behavioral indicators, and combined with detection and classification of cyber data from employees’ computer/network use, the integrated system will empower a HR/cyber/insider threat team with enhanced situation awareness to facilitate the detection and prevention of insider crimes."

Nor is the benefit of the model-based insider threat approach simply that it can prevent potentially huge costs to an employer in financial, technological or reputational terms. Another benefit, PNNL says, is that it creates "a window of opportunity for dealing with the personnel problems affecting these subjects," thus "helping the employee before a bad situation turns worse."

This article is published as part of the IDG Contributor Network.
