AI in cybersecurity: what works and what doesn't

Much of what we hear about artificial intelligence and machine learning in security products is steeped in marketing, making it hard to know what these tools actually do. Here's a clear-eyed look at the current state of AI & ML in security.

Let's start by dispelling the most common misconception: There is very little, if any, true artificial intelligence (AI) incorporated in enterprise security software. The fact that the term comes up so frequently has a great deal to do with marketing and very little to do with the technology. Pure AI is about reproducing cognitive abilities.

That said, machine learning (ML), one of many subsets of artificial intelligence, is being baked into some security software. But even the term machine learning may be employed somewhat optimistically. Its use in security software today has more in common with the rules-based "expert systems" of the 1980s and 1990s than with true AI. If you've ever used a Bayesian spam trap and trained it with thousands of known spam emails and thousands of known good emails, you have a glimmer of how machine learning works, if not of its scale. In most cases, it is not capable of self-training and requires human intervention, including programming, to update its training. Security involves so many variables and data points that keeping the training current, and therefore effective, can be a challenge.
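To make the spam-trap comparison concrete, here is a minimal sketch of a Bayesian filter in Python. It is an illustration only, not any product's implementation: the class name, the whitespace tokenizer and the four training messages are all assumptions, and a real filter would be trained on thousands of messages.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


class NaiveBayesSpamFilter:
    """Hypothetical toy filter: multinomial naive Bayes with add-one smoothing."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record one human-labeled message ('spam' or 'ham')."""
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def spam_probability(self, text):
        """Return the estimated probability that the message is spam."""
        if 0 in self.doc_counts.values():
            raise ValueError("train on at least one spam and one ham message")
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        log_score = {}
        for label in ("spam", "ham"):
            # Prior: share of training messages carrying this label.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                # Add-one smoothing so unseen words do not zero out the score.
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(vocab)))
            log_score[label] = score
        # Normalize the two log scores into a probability.
        top = max(log_score.values())
        weights = {k: math.exp(v - top) for k, v in log_score.items()}
        return weights["spam"] / (weights["spam"] + weights["ham"])


# Made-up training data standing in for "thousands of known emails".
spam_filter = NaiveBayesSpamFilter()
spam_filter.train("win free money now", "spam")
spam_filter.train("free prize claim now", "spam")
spam_filter.train("meeting agenda for monday", "ham")
spam_filter.train("quarterly report attached for review", "ham")

print(spam_filter.spam_probability("claim your free money"))
print(spam_filter.spam_probability("agenda for the monday meeting"))
```

Note that the filter only knows what a human has labeled for it. Nothing here retrains on its own, which is the limitation described above.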

Machine learning, however, can be very effective when people who know what they're doing train it with a high volume of data from the environment in which it will be used. Although complex systems are possible, machine learning works better at targeted tasks or sets of tasks than at a wide-ranging mission.

One of machine learning's greater strengths is outlier detection, which is the basis of user and entity behavior analytics (UEBA), says Chris Kissel, IDC research director, global security products. "The short definition of what UEBA does," he adds, "is determining whether an activity emanating from or being received by a given device is anomalous." UEBA fits naturally into many major cybersecurity defensive activities.

When a machine learning system is trained thoroughly and well, in most cases you've defined the known good events. That lets your threat intelligence or security monitoring system focus on identifying anomalies. What happens when the system is trained by the vendor solely with its own generic data? Or it is trained with an insufficient volume of events? Or there are too many outliers that lack identification, which become part of a rising din of background noise? You may wind up with the bane of enterprise threat-detection software: an endless succession of false positives. If you're not training your machine-learning system on an ongoing basis, you won't get the real advantage that ML has to offer. And as time goes by, your system will become less effective.
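A rough sketch of that idea in Python: define a baseline from known good events, then flag anything that strays too far from it. The z-score test, the threshold of 3.0 and the daily login counts are assumptions chosen for illustration. Commercial UEBA products use far richer models.

```python
import statistics


def build_baseline(known_good):
    """Summarize known-good observations as (mean, sample standard deviation)."""
    if len(known_good) < 2:
        raise ValueError("need at least two known-good observations")
    return statistics.mean(known_good), statistics.stdev(known_good)


def is_anomalous(value, baseline, threshold=3.0):
    """Flag a value that sits more than `threshold` deviations from the mean."""
    mean, stdev = baseline
    if stdev == 0:
        return value != mean
    return abs(value - mean) / stdev > threshold


# Hypothetical daily login counts for one account, all known to be legitimate.
well_trained = build_baseline([42, 38, 45, 40, 41, 39, 44, 43])
# The same account, "trained" on only two days of data.
under_trained = build_baseline([42, 43])

print(is_anomalous(400, well_trained))   # a genuine outlier
print(is_anomalous(46, well_trained))    # a slightly busy but ordinary day
print(is_anomalous(46, under_trained))   # same day, judged by the thin baseline
```

With the fuller baseline, a day with 46 logins passes as normal. With the two-day baseline, the same day is flagged, which is a small-scale version of the false-positive problem that comes from training on too few events.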

Caveats aside, machine learning can streamline processes and advise SOC personnel. It shows the promise of what may be to come from more powerful AI-based systems. Here's where it's working now.

Top 9 uses of machine learning for enterprise security 

  1. Detecting and helping to thwart cyberattacks in progress. It's not going to close the door before the attack happens, at least not yet, but machine learning may find the indicators before humans would and then suggest possible actions. "We use machine learning to detect the degree of attack for unknown DDoS attacks," says Pascal Geenens, security researcher, Radware. "Machine learning also characterizes the attack traffic, and automatically generates signatures for blocking the attack."
  2. Threat intelligence. Machine learning excels at poring over mountains of data and categorizing the behaviors it finds. When it sees something out of the ordinary it can alert a human analyst. Machine learning is the block and tackle that becomes a force multiplier for screening huge pools of data much more quickly than humans could. Overload is a tactic commonly employed by the bad guys. It may be easier said than done, but threat detection systems become a lot more effective the closer they are to real time.
  3. Identifying, prioritizing and helping to remediate existing vulnerabilities. This should be a regular activity at all enterprises. With a solid machine learning-based system carrying it out every day, the biggest gotcha in enterprise security, unpatched vulnerabilities, might finally begin to be less of a concern.
  4. Security monitoring. This is the process of keeping up with information about network traffic, internal and external behaviors, data access and a wide range of other functions and activities. When properly programmed, machine learning is good at consuming large pools of data and looking for anomalies, so it may be just the right technology for juggling log files and error messages from a wide range of products.
  5. Detecting malware, including ransomware and phishing attacks. Ransomware families are growing like Topsy. Machine learning may be the only tool available to us that isn't backward-facing, unlike signatures that detect only yesterday's ransomware. The ability to check for anomalous behaviors is being put to work chasing ransomware to good effect.
  6. Examining code for vulnerabilities. One of the mantras of DevSecOps is "security as code." It's clear that developers need to know how to code for security concerns, but machine learning can help automate that process by analyzing code for common loopholes and vulnerabilities that can be exploited. It might be a tool that could be used to teach uninitiated developers, in fact.
  7. Data categorization. To meet data privacy and data protection regulations, you need to know the characteristics of the data you're protecting. Machine learning can be harnessed to scan newly arriving data and classify it to levels of sensitivity, so that your systems can protect it in the ways it needs to be protected.
  8. Honeypots. There is one specific area where deep learning, something a little closer to true AI, can be used with automated mitigation today. "By deploying honeypots in enterprise networks around the internet, we are able to gather data that we can label as malicious," Geenens says. "Every event or traffic instance detected by a honeypot is 100% malicious. If we have enough honeypots and data, deep neural networks can be used to create a model that can detect attacks with strong accuracy."
  9. Predicting and adapting to future threats. A few companies are working on predictive security analytics. Predictive analytics shows some promise for business intelligence. Can similar machine learning technology be harnessed to project future vulnerabilities and breaches? The jury is out on this one.
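The honeypot item in the list above turns on a labeling trick: anything a honeypot sees can be marked malicious without human review. The Python sketch below shows that idea under heavy assumptions. It swaps the deep neural network Geenens describes for a far simpler nearest-centroid classifier, and the two features (connections per minute, distinct ports touched), the sample values and the separate pool of known good traffic are all invented for illustration.

```python
import math


def centroid(rows):
    """Return the per-feature mean of a list of equal-length tuples."""
    n = len(rows)
    return tuple(sum(column) / n for column in zip(*rows))


def train(honeypot_events, known_good_events):
    """Build one centroid per label; honeypot traffic is labeled malicious."""
    return {
        "malicious": centroid(honeypot_events),
        "benign": centroid(known_good_events),
    }


def classify(event, model):
    """Assign the label whose centroid is closest (Euclidean distance)."""
    return min(model, key=lambda label: math.dist(event, model[label]))


# Hypothetical features: (connections per minute, distinct ports touched).
honeypot_events = [(120, 30), (200, 45), (150, 25)]   # auto-labeled malicious
known_good_events = [(5, 2), (8, 1), (3, 2)]          # assumed benign sample

model = train(honeypot_events, known_good_events)
print(classify((180, 40), model))
print(classify((6, 1), model))
```

The design point is the source of the labels, not the classifier. A production system would need far more data, scaled features and a stronger model, as the quote suggests.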

Learn the truth

Some experts we consulted for this story denied there are any artificial intelligence-based products in the field. This may be somewhat disingenuous. AI can be used as an umbrella term to denote a broad set of technologies that includes machine learning among others that aren't technically artificial intelligence. Or it can be used in the strictest sense, referring to a computer system that has cognitive abilities. Niall Browne, CISO and SVP trust and security at Domo, is not a believer in today's "AI-based" security products. "AI has tremendous potential and will play a pivotal role in the future of security. However ... there are very few examples of AI being successfully deployed in enterprise security." He does acknowledge that ML has security uses.

Browne is not alone in being tired of the AI hype surrounding some security products. His sentiments were echoed by several other sources.

Reg Harnish, CEO, GreyCastle Security, sums up the mythology by saying "today, many software vendors that claim their products are AI enabled are using brute force to hardwire fixed rules rather than applying intelligence." What questions should CSOs/CISOs ask security vendors to avoid this type of lipstick-on-a-pig approach to machine learning?

"The primary question is: 'How does it learn?' You need to understand the specific mechanisms implemented for training ML or AI," says Tom Koulopoulos, founder of Delphi and advisor to cloud storage startup Wasabi. "What volume of data is required? How often is retraining needed? What's the mechanism for collaboration with the algorithm and its ranking by a human? Can the ML or AI work with archival data sets or just online data?" Koulopoulos adds.

Kayne McGladrey, IEEE member and director, Integral Partners, gave this advice: "Evaluate an AI-based security solution by standing it up in a lab, alongside a replica of your environment. Then contract a reputable external red team to repeatedly attempt to breach the environment."

Parting thought

Even the sources most critical of the level of AI built into existing enterprise security platforms fully buy into the future of AI for security. There's a simple reason for that: cybercriminals are already using machine learning. "It's only a matter of time before AI becomes a reality across all industries, including cybercrime," says Browne. "Every time security has built a new protection, cybercriminals have developed a way around that defense. AI will exponentially accelerate this dance. Imagine a world where intelligent criminal systems try to break into banks, hospitals and energy companies 24x7x365. Of course, the AI systems at these institutions will counter with hundreds of intuitive moves per second in an attempt to keep cybercriminals at bay. This is the challenge and opportunity that AI will present in the future."

Copyright © 2018 IDG Communications, Inc.
