In Greek mythology, multi-headed Cerberus, the “Hound of Hades”, guarded the gates of the underworld to prevent the dead from leaving, with the multiple heads allowing him to look in several places at once. In our world, cyber security must play the role of Cerberus, protecting organizations from attackers who are determined to break in. But the Hound, despite his multiple heads, was a legacy security product, guarding but one entry point out of the underworld.

The modern enterprise network now encompasses public and private cloud and industrial environments, in addition to the familiar traditional network environments, so a modern security platform must be able to see much further than Cerberus to detect attacks in all the places where business is conducted. In addition to detecting known threats, a modern security platform must also be able to detect the non-obvious threats - i.e., “unknown” threats. In these kinds of attacks, cybercriminals use techniques that aren’t identifiable via signatures and rules (e.g., a phishing email that takes advantage of vulnerabilities in human behavior) to compromise an organization’s defenses and initiate slowly gestating multi-stage attacks.

The importance of reliable detections can’t be stressed enough. Legacy approaches to threat detection generate warnings on any suspicious activity. Security teams are understaffed and overworked and need to be confident that the alerts generated by their security products will not turn out to be false alarms. Products that are prone to excessive alerting and lack threat verification overwhelm security teams, contribute to analyst burnout and cause entire teams to jump ship, which is cause for alarm as over 1 million cyber security jobs remain unfilled.

The way to ensure reliable detection is with a hierarchy of expert systems that uses a range of indicators to rapidly analyze data using multiple analysis techniques and extract the maximum security-relevant information and context from it. A hallmark of expert systems is the ability to process hundreds of billions of records and petabytes of information daily. This is incredibly important, as business IP traffic is expected to grow at a CAGR of 18 percent over the next few years, starting from 16,399 petabytes per month in 2016 and reaching 32,165 petabytes per month in 2020. Modern security platforms must be able to rapidly analyze the huge volumes of network traffic that are flowing through enterprise networks now to detect threats in real-time, and also invisibly scale performance as traffic grows in the future.
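As a quick sanity check on the growth figures quoted above, the compound annual growth rate can be recomputed from the two endpoints. This is a minimal sketch; the function name `cagr` and the three-year extrapolation are illustrative assumptions, not part of the cited forecast.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate between two values `years` apart."""
    return (end / start) ** (1 / years) - 1

# Figures quoted above, in petabytes per month (2016 and 2020).
rate = cagr(16_399, 32_165, years=4)
print(f"Implied CAGR 2016-2020: {rate:.1%}")

# Hypothetical extrapolation: the monthly volume a platform would face if
# the same rate held for three more years (an assumption, not a forecast).
projected = 32_165 * (1 + rate) ** 3
print(f"Projected monthly volume after 3 more years: {projected:,.0f} PB")
```

The implied rate comes out at roughly 18 percent, consistent with the figure in the text, and shows why a platform sized only for today's traffic falls behind within a few years.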

Modern attacks can unfold over long periods, so the dimension of time must also be brought into the analysis. Analysts need a way to confirm the veracity of detections, so such a security platform must make it very easy for analysts to see what sequence of actions resulted in a security event in the case of an unknown attack, or what rule or rules were triggered for a known one. It would be incredibly valuable for analysts to have easy access to the evidence (e.g., full fidelity forensics) that adds context to the event. Most products, however, cannot provide that information on demand over periods long enough to match the breach detection window.

At a minimum, the hierarchy of expert systems should include the following analysis techniques.

Intrusion Detection. Signatures and rules provide a deterministic way to identify known threats and there is no shortage of threat intelligence sources. At the same time, the uniqueness of each organization’s network can impact the effectiveness of threat intel from commercial and third party sources. The intrusion detection expert system must allow analysts to tailor the most meaningful components of commercially available feeds while also incorporating the intel that is uniquely valuable to their organization.
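To make the idea of tailoring feeds concrete, here is a minimal sketch of a ruleset that keeps only the relevant categories of a commercial feed and adds locally sourced intel. The `Rule` fields, the category names, and the byte patterns are all hypothetical; real signature languages are far richer than a substring match.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    rule_id: str
    pattern: bytes   # byte sequence to look for in a payload
    category: str    # e.g. "malware", "ics", "policy" (illustrative)
    source: str      # "commercial" or "local"

def build_ruleset(commercial, local, enabled_categories):
    """Keep the commercial rules relevant to this network, then add local intel."""
    tailored = [r for r in commercial if r.category in enabled_categories]
    return tailored + list(local)

def match(payload: bytes, ruleset):
    """Deterministic check: return the ids of every rule whose pattern appears."""
    return [r.rule_id for r in ruleset if r.pattern in payload]

commercial_feed = [
    Rule("C-1001", b"evil-dropper", "malware", "commercial"),
    Rule("C-2001", b"modbus-write-coil", "ics", "commercial"),
]
local_intel = [Rule("L-0001", b"internal-honeytoken", "policy", "local")]

# This hypothetical organization has no industrial network, so "ics" is left out.
ruleset = build_ruleset(commercial_feed, local_intel, enabled_categories={"malware"})
hits = match(b"GET /download?f=evil-dropper.exe", ruleset)
print(hits)
```

Because matching is deterministic, every hit can be traced back to the exact rule that fired, which is what lets an analyst confirm a detection for a known threat.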

Machine learning. Machine learning provides a way to surface the “unknown” threats that can’t definitively be identified as such by other techniques. Though the results of machine learning are probabilistic, skillful feature selection and sufficient training data can ensure that machine learning models provide reliable detections. Broadly, there are two ways to train machine learning models.

○ Supervised learning uses labeled training data to differentiate between security events and non-events. When it’s possible to find statistically high-quality training data, supervised learning provides a way to reliably identify threats that can’t be described by signatures and rules (e.g., attackers using domain generation algorithms (DGAs) to generate numerous domain names that serve as communication points for command-and-control (C&C) servers).

○ Unsupervised learning learns to differentiate without labeled training data. For example, it can be used in anomaly detection: the algorithm learns what constitutes “normal behavior” in an organization’s environment (i.e., baselining) and then identifies anomalies based on whether deviations from normal behavior are statistically significant. In addition to being able to detect without labeled training data, another advantage of unsupervised learning is that it can be set up to continuously rebaseline behaviors, as what constitutes normal behavior may change over time.
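The baselining idea can be illustrated with a minimal sketch that keeps a rolling window of recent observations and flags values that deviate from it by more than a chosen number of standard deviations. The class name, window size, threshold, and sample traffic counts are assumptions made for illustration; a production model would use richer features than a single metric.

```python
from collections import deque
from statistics import mean, stdev

class RollingBaseline:
    """Learns "normal" from recent values and flags large deviations."""

    def __init__(self, window: int = 50, threshold: float = 3.0, min_samples: int = 10):
        self.history = deque(maxlen=window)  # old values fall out: continuous rebaselining
        self.threshold = threshold           # allowed deviation, in standard deviations
        self.min_samples = min_samples       # observations needed before judging

    def observe(self, value: float) -> bool:
        """Return True if `value` is anomalous relative to the current baseline."""
        anomalous = False
        if len(self.history) >= self.min_samples:
            mu, sigma = mean(self.history), stdev(self.history)
            if sigma > 0:
                anomalous = abs(value - mu) / sigma > self.threshold
        # Only non-anomalous values update the baseline, so a burst of
        # attack traffic does not redefine "normal".
        if not anomalous:
            self.history.append(value)
        return anomalous

baseline = RollingBaseline()
# Hypothetical outbound-connections-per-minute counts for one host.
normal_traffic = [20, 22, 19, 21, 23, 20, 18, 22, 21, 20, 19, 22]
flags = [baseline.observe(v) for v in normal_traffic]
spike = baseline.observe(400)
print(flags, spike)
```

Excluding flagged values from the window is one design choice among several: it protects the baseline from being poisoned, at the cost of adapting only to gradual shifts in behavior rather than abrupt ones.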

Retrospective Analysis. As new information about how attackers are exploiting vulnerabilities becomes available, it would be incredibly valuable if organizations were automatically notified if that zero-day threat has ever impacted them. Key to this retrospective analysis capability is the ability of a modern security platform to provide long-term and affordable retention of full packet data. In addition to enabling discovery of any past exploit of a newly discovered vulnerability, long-term retention of PCAP data provides threat hunters with what they need to do their jobs efficiently and incident responders with the evidence to determine if an alert that’s being investigated is indeed real.
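A minimal sketch of the retrospective step might look like the following: when new indicators are published, the retained records are searched for earlier matches. The `FlowRecord` fields, the sample archive, and the indicator values are hypothetical stand-ins for metadata extracted from stored packet data.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class FlowRecord:
    timestamp: datetime
    src_ip: str
    dst_domain: str
    payload_sha256: str  # shortened placeholder values are used below

def retrospective_scan(archive, new_domains, new_hashes):
    """Return archived records matching newly published indicators, oldest first."""
    return sorted(
        (r for r in archive
         if r.dst_domain in new_domains or r.payload_sha256 in new_hashes),
        key=lambda r: r.timestamp,
    )

archive = [
    FlowRecord(datetime(2017, 3, 2, 9, 15), "10.0.0.5", "updates.example.com", "aa11"),
    FlowRecord(datetime(2017, 1, 14, 22, 40), "10.0.0.9", "cdn.bad.example", "bb22"),
    FlowRecord(datetime(2017, 2, 20, 13, 5), "10.0.0.7", "files.example.org", "cc33"),
]

# Indicators from a hypothetical advisory published after the traffic was captured.
matches = retrospective_scan(archive, new_domains={"cdn.bad.example"}, new_hashes={"cc33"})
for m in matches:
    print(m.timestamp.isoformat(), m.src_ip, m.dst_domain)
```

The scan is only as useful as the retention period behind it: an indicator for activity that predates the oldest stored record cannot be matched.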

Heuristics. Heuristics are similar to intrusion detection rules in that they are deterministic rules but differ as they look at more loosely defined behaviors. They are very useful for discovering activities that are difficult to detect via signatures, in particular, those that do not have unique packet payload content. Heuristics should be implemented to retain state over long time periods, allowing them to watch for different behaviors unfolding across a large number of data points (e.g., a successful exploitation progressing along the Cyber Kill Chain).
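As an illustration of a heuristic that retains state over time, the sketch below tracks how far each host has progressed through an ordered list of stages and fires only when the full sequence completes within a time limit. The stage names are a simplified subset chosen for the example, and the class name and 30-day limit are assumptions.

```python
# Simplified, ordered subset of kill-chain stages (an illustrative assumption).
STAGES = ["reconnaissance", "delivery", "exploitation", "command_and_control"]

class KillChainHeuristic:
    def __init__(self, max_age_seconds: float):
        self.max_age = max_age_seconds
        self.progress = {}  # host -> (index of next expected stage, start time)

    def observe(self, host: str, stage: str, ts: float) -> bool:
        """Record a behavior; return True once `host` completes every stage in order."""
        idx, started = self.progress.get(host, (0, ts))
        if idx > 0 and ts - started > self.max_age:
            idx, started = 0, ts           # stale partial sequence: start over
        if stage == STAGES[idx]:
            if idx == 0:
                started = ts               # the clock starts at the first stage
            idx += 1
        if idx == len(STAGES):
            self.progress.pop(host, None)  # sequence complete: reset and alert
            return True
        self.progress[host] = (idx, started)
        return False

DAY = 86_400
heuristic = KillChainHeuristic(max_age_seconds=30 * DAY)
events = [
    ("10.0.0.5", "reconnaissance", 0 * DAY),
    ("10.0.0.8", "delivery", 1 * DAY),  # no earlier stage for this host: ignored
    ("10.0.0.5", "delivery", 3 * DAY),
    ("10.0.0.5", "exploitation", 10 * DAY),
    ("10.0.0.5", "command_and_control", 12 * DAY),
]
alerts = [host for host, stage, ts in events if heuristic.observe(host, stage, ts)]
print(alerts)
```

No single event here would justify an alert on its own; it is the ordered progression for one host, remembered across days, that the heuristic detects.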

A hierarchy of expert systems ensures reliable detections. When a security event is generated, it’s either because a single expert system definitively identified an action as an attack (e.g., an intrusion detection rule finding malware based on a signature) or because of a consensus between multiple expert systems, each using an independent analysis technique. Such a multifaceted approach, encompassing both deterministic (e.g., signatures and rules) and probabilistic (e.g., machine learning) techniques, ensures that when security events are generated - whether for unknown or known attacks - they are indeed real, and not simply alarms about suspicious activities. It also ensures that security teams focus on the threats that matter to their organization and do not burn out conducting pointless investigations.
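The decision logic described above can be sketched in a few lines: an event is raised on a single definitive verdict or when enough independent systems agree. The `Verdict` fields and the thresholds `min_score` and `min_agreeing` are hypothetical values chosen for illustration, not parameters of any particular product.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Verdict:
    system: str       # which expert system produced the verdict
    definitive: bool  # True when the system identified the attack outright
    score: float      # confidence in [0, 1]

def should_raise_event(verdicts, min_score: float = 0.7, min_agreeing: int = 2) -> bool:
    """Raise an event on one definitive verdict or on consensus among systems."""
    if any(v.definitive for v in verdicts):
        return True
    # Count distinct systems, so one system reporting twice is not a consensus.
    agreeing = {v.system for v in verdicts if v.score >= min_score}
    return len(agreeing) >= min_agreeing

signature_hit = [Verdict("intrusion_detection", True, 1.0)]
lone_model = [Verdict("machine_learning", False, 0.9)]
consensus = [
    Verdict("machine_learning", False, 0.9),
    Verdict("heuristics", False, 0.8),
]
print(should_raise_event(signature_hit), should_raise_event(lone_model),
      should_raise_event(consensus))
```

A lone probabilistic verdict stays below the bar, which is what keeps merely suspicious activity from reaching analysts as an alert.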