Detect the undetectable: Start with event logs

Security event monitoring systems are often plagued by signal-to-noise problems. Here's how to ensure they produce meaningful alerts

One of the most interesting facts in the field of computer security is that most corporate victims fail to detect their own compromises. Most often the malicious activity is first noticed by outsiders, but even then the discovery may occur many months, if not years, after the original compromise.

This is despite the fact that most victims have the evidence of the compromise sitting in their own event logs. They simply don't look. That would never be acceptable, but it's downright negligent in today's world of constant spam, ever-present malware, and ever-successful APTs (advanced persistent threats).

In truth, no one likes to ignore event logs. The central problem is that most alerting systems are 99.999 percent full of events that indicate nothing malicious whatsoever -- it's a self-induced denial-of-service attack. We get information overload from everywhere: firewalls, IDSes, antimalware consoles, antispam, system logs, forensics analysis, network flow analysis, and honeypots. Almost none of it is truly useful.

So how do you change the status quo to get more relevant, actionable information?

Step one is ending the madness of generating events and alerts that never result in an immediate, meaningful response. Turn the traditional event-logging model on its head by deciding to create alerts only on events that are absolutely malicious and/or should always result in an immediate, meaningful investigative response. To do this, you need to define which events indicate malicious mischief in no uncertain terms.

Raising the alarm on bad logons
Suppose domain admins and enterprise admins should never log on to regular workstations. If someone belonging to those superelevated groups does so, it should create an event to be investigated.

Microsoft Windows has defined such an event-monitoring tool, known as Special Logons. With Special Logons, admins define which groups are considered "special" and write those groups to the Special Logons table on each computer to be monitored. When the feature is enabled and a member of a defined group logs on to a monitored computer, Windows generates a new event that should be immediately forwarded to an event log collector, which then generates an alert.
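The collector-side check can be sketched in a few lines of Python. This is only an illustration of the logic, not how Windows implements the feature: the `Logon` record, the `host_role` label, and the host and account names are all hypothetical, and a real collector would parse these fields out of forwarded events.

```python
from dataclasses import dataclass

# Groups treated as "special" in this sketch; adjust to your own directory.
SPECIAL_GROUPS = {"Domain Admins", "Enterprise Admins"}


@dataclass(frozen=True)
class Logon:
    user: str
    groups: frozenset  # group names the account belongs to
    host: str
    host_role: str     # "workstation" or "server" (assumed inventory label)


def special_logon_alerts(logons):
    """Return (user, host, groups) for special-group logons on workstations."""
    alerts = []
    for logon in logons:
        matched = SPECIAL_GROUPS & logon.groups
        if matched and logon.host_role == "workstation":
            alerts.append((logon.user, logon.host, sorted(matched)))
    return alerts


# Made-up sample data: only the last logon should raise an alert.
sample = [
    Logon("alice", frozenset({"Domain Users"}), "WS-014", "workstation"),
    Logon("da-bob", frozenset({"Domain Users", "Domain Admins"}), "DC-01", "server"),
    Logon("da-bob", frozenset({"Domain Users", "Domain Admins"}), "WS-022", "workstation"),
]

for user, host, groups in special_logon_alerts(sample):
    print(f"ALERT: {user} ({', '.join(groups)}) logged on to {host}")
```

The point of the design is that nothing is alerted on unless both conditions hold, so every alert that does fire deserves an investigation.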

The one company in the world that hasn't been pwned uses this trick. Bad guys usually try to get elevated accounts, then use them to laterally traverse the network, hopping computer to computer. They usually have "god account" credentials with which they can log in all over the place. They have no idea what Special Logons are or where they should or shouldn't be logging on using the god accounts.

A lot of good event logging involves watching normal, recurring events and alerting only when their volume is abnormally high. For example, a "bad logon" is a pretty normal event on most networks. Bad logons occur on almost every computer at least once a week, if not multiple times a day, usually because the legitimate user typed a bad password. This happens more and more, especially as we knowingly increase password lengths. You can't just generate an alert for a bad logon or even 100 bad logons on your network.

That's why bad logons, like many other events, require baselines and thresholds. The idea is to measure the normal range of bad logons in your environment, then alert only when that range has been exceeded -- and not by a small amount.

Suppose you have 1,000 bad logons across your network on an average day. I would set the alert threshold at four to 10 times that baseline. Some days you're going to see bad logons go very high for a legitimate reason. But if a bad guy (or their malware) is trying to guess passwords, they'll try tens of thousands to millions of guesses in a short period of time.
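As a rough sketch of the baseline-and-threshold idea, the Python below averages past daily counts and alerts only when today's count exceeds a multiple of that average. The function names, the use of a simple mean as the baseline, and the sample numbers are assumptions made for illustration; the multiplier of four is the low end of the range suggested above.

```python
from statistics import mean


def bad_logon_threshold(daily_counts, multiplier=4):
    """Threshold = baseline (mean of past daily counts) times a multiplier."""
    if not daily_counts:
        raise ValueError("need at least one day of history")
    return mean(daily_counts) * multiplier


def should_alert(today_count, daily_counts, multiplier=4):
    """True only when today's count is above the computed threshold."""
    return today_count > bad_logon_threshold(daily_counts, multiplier)


# Made-up history that averages 1,000 bad logons per day.
history = [950, 1020, 1100, 980, 950]

print(should_alert(2500, history))    # busy but legitimate day
print(should_alert(45000, history))   # password-guessing volume
```

A busy day at two or three times the baseline stays quiet, while a guessing attack that generates tens of thousands of failures clears the threshold easily.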

Events to watch
As you can see from the above example, the event log collector needs to collect all bad logons and count them, but raise the alarm only when set parameters have been greatly exceeded. You need to do this for literally dozens of events.

Which ones? Well, I have a white paper and spreadsheet on the subject (it applies only to Microsoft Windows computers). You can download the white paper and spreadsheet here [link TK from Roger]. If you don't trust my recommendations, how about the NSA's? Sure, the NSA may be listening in on things we don't want it to, but its computer security guides have long been coveted and are among the most accurate. Download the NSA's guide: "Spotting the Adversary with Windows Event Log Monitoring."

Deploying breach systems
Another evolving class of malicious-behavior detection products is breach systems, which use a variety of methods that go well beyond traditional event logging to detect badness.

For example, some systems, like Damballa, can detect computers connecting outbound to known command-and-control (C&C) bot computers. Today, most malware connects back to C&C servers to get its instructions and to download additional, undetectable software for bypassing antimalware scanners. The good systems have a fairly good database of bot networks and C&C servers, and if a computer connects to them, you'll know. These types of systems can detect malware that other types of computer defenses miss.
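The core matching step can be illustrated with a short Python sketch. This is not how Damballa or any particular product works; it only shows the idea of checking outbound connections against a list of known C&C addresses. The addresses come from reserved documentation ranges, and the host names and the `c2_hits` function are hypothetical.

```python
# Made-up "known C&C" addresses from reserved documentation ranges.
KNOWN_C2 = {"203.0.113.45", "198.51.100.7"}


def c2_hits(connections, known_c2=KNOWN_C2):
    """Return sorted internal hosts that connected to a known C&C address.

    connections: iterable of (source_host, destination_ip) pairs.
    """
    return sorted({src for src, dst in connections if dst in known_c2})


# Made-up outbound connection records.
outbound = [
    ("WS-014", "192.0.2.10"),
    ("WS-022", "203.0.113.45"),
    ("SRV-DB1", "198.51.100.7"),
    ("WS-022", "203.0.113.45"),
]

for host in c2_hits(outbound):
    print(f"ALERT: {host} contacted a known C&C address")
```

The detection is only as good as the address list behind it, which is why the quality of a vendor's C&C database matters so much.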

NSS Labs recently posted a fairly good buyer's guide on breach systems.

Whitelisting systems
Longtime readers already know I'm a huge fan of whitelisting/application control programs. They're the single best way to decrease risk in your environment. But many people can't enable them, at least in whitelisting "enforcement" mode.

Regardless, I'm a big proponent of using them as auditing programs. Enable the whitelisting program in auditing-only mode. Snapshot your computers and tell your whitelisting program to send out events only on new software installs or execution. This strategy may be a bit overwhelming to use on regular, end-user workstations, but works like a charm on infrastructure servers that shouldn't be getting a lot of new software on a regular basis.
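In audit-only mode the logic boils down to comparing what runs against the snapshot and reporting anything new. The Python below is a minimal sketch of that comparison, assuming software is identified by SHA-256 hash; the function names, file paths, and file contents are invented for the example, and real whitelisting products have their own rule formats.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Hex SHA-256 digest used here as the identity of an executable."""
    return hashlib.sha256(data).hexdigest()


def new_software_events(snapshot_hashes, executions):
    """Return (path, hash) pairs whose hash is absent from the snapshot.

    executions: iterable of (path, file_hash) pairs seen after the snapshot.
    """
    return [(path, h) for path, h in executions if h not in snapshot_hashes]


# Made-up file contents standing in for binaries on a server.
approved = sha256_of(b"known good service binary")
unknown = sha256_of(b"something installed last night")

snapshot = {approved}
observed = [
    ("C:/Services/svc.exe", approved),
    ("C:/Temp/update.exe", unknown),
]

for path, file_hash in new_software_events(snapshot, observed):
    print(f"AUDIT: new software executed: {path} ({file_hash[:12]}...)")
```

On an infrastructure server that rarely changes, the list this produces should be short enough that each entry can be checked by hand.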

Netflows
I'm also a big fan of learning which computers in your environment should be talking to which other computers. Most servers shouldn't be talking to most other servers. Most workstations should not be talking to all servers. Most workstations don't connect to other workstations. Using any program you have at your disposal (there are free and commercial programs that do this), take baselines of network flow activity. Record what is normal and expected, then alert on the outliers and the new.
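A minimal Python sketch of the "record what is normal, then alert on the new" step follows. It assumes each flow has already been reduced to a (source, destination, port) tuple; the function names and the host names are hypothetical, and a real deployment would read flows from a NetFlow or similar collector.

```python
def build_baseline(flows):
    """Record every (source, destination, port) seen during the learning period."""
    return set(flows)


def new_flows(baseline, observed):
    """Return flows not in the baseline, each reported once, in order seen."""
    seen = set()
    result = []
    for flow in observed:
        if flow not in baseline and flow not in seen:
            seen.add(flow)
            result.append(flow)
    return result


# Made-up learning-period traffic.
baseline = build_baseline([
    ("WS-014", "SRV-FILE1", 445),
    ("WS-022", "SRV-FILE1", 445),
    ("SRV-WEB1", "SRV-DB1", 1433),
])

# Made-up traffic from today; the workstation-to-workstation flow is new.
today = [
    ("WS-014", "SRV-FILE1", 445),
    ("WS-022", "WS-014", 445),
    ("WS-022", "WS-014", 445),
]

for src, dst, port in new_flows(baseline, today):
    print(f"ALERT: new flow {src} -> {dst}:{port}")
```

Workstation-to-workstation connections like the one flagged here are exactly the kind of outlier that lateral movement tends to produce.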

The bottom line is to start thinking of events that absolutely indicate maliciousness and alert only on those events. In a typical corporate network, servers and workstations generate billions of events per day. Turn the event-monitoring model on its head and pull the trigger on next to nothing. Or rather, generate all the events you like at the local computer level, but forward only those rare and telling events to the event log collector, which can then generate actionable alerts.

If you do it right, your corporate network should generate only a handful of events a day to investigate. If you think about it, this is the way it should have always been. We just weren't given the right tools, the right information, or the right mind-set.

Copyright © 2013 IDG Communications, Inc.
