System logs generated by servers and other network apparatus can produce data in vast quantities, and sooner or later, attempts at managing such information in an off-the-cuff fashion are no longer viable.
Consequently, information systems managers are tasked with devising strategies for taming these volumes of log data to remain compliant with company IT policy, and also to gain holistic visibility across all IT systems deployed throughout the organization. With a tad of guidance and a bit of planning, the recipe for log management is actually straightforward, and the rewards are surprisingly favorable.
What is log management?
First and foremost, a definition of log management is in order. The National Institute of Standards and Technology (NIST) defines log management in Special Publication 800-92 as: "the process for generating, transmitting, storing, analyzing, and disposing of computer security log data." As you probably knew that much already, what does log management really entail? Put simply, log management is defining what you need to log, how to log it, and how long to retain the information. This ultimately translates into requirements for hardware, software, and of course, policies.
Benefits of log management systems are abundant, and their return on investment is significant. To quantify the value of an investment in this area, it helps to view log management systems as business intelligence systems. Our business is of course information security, but many of the same features and benefits found in traditional BI systems are also present in log management systems. From data extraction, transformation, and loading (ETL) to back-end enterprise data warehouses, all of the standard BI moving parts are also found in many log management systems.
The log management system may be a highly specialized business intelligence system in disguise, yet like its business-focused cousin, it brings game-changing benefits to the table. For example, day-to-day transactional data can finally be viewed across the organization as a whole rather than in discrete and disjointed silos. This ability to watch all systems simultaneously is a bit like being everywhere at once. As godly as it may sound, the reality is that this new set of virtual eyes increases your effectiveness without increasing your headcount. Amplified visibility into enterprise-wide events also equates to an increased awareness of real-time activity, which ultimately improves overall security posture by empowering staff to react quickly to malicious events.
Security professionals have long understood the benefits log management systems provide through the centralized storage of logs. Given that it's practically standard operating procedure for hackers to obfuscate their method of intrusion by destroying logs and disabling accounting mechanisms, having a protected and centralized copy of such data ensures that valuable information is preserved for post-mortem analysis, and that evidence is available for any follow-up legal action.
Interestingly, not all benefits of log management systems are security-centric. For instance, many network devices such as routers or firewalls have limited electronic buffers reserved for logging. Once those buffers reach capacity, older entries are discarded to accommodate more recent events. Devices hosted on busy circuits are sure to have high log volume, and discarding the majority of events simply isn't feasible for engineers tasked with troubleshooting operational issues. By forwarding logs to centralized systems that are packed with high-capacity disk drives, operations staff can access event data within time spans that yield adequate context to the issue at hand. Non-volatile storage of log data also opens the door to better infrastructure sizing projections through trend analysis. This allows managers to accurately gauge future growth patterns and perhaps justify budget requests.
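The buffer behavior described above can be illustrated with a small sketch. The three-entry buffer size and the event names are purely hypothetical, and the list standing in for the central server is an assumption made for illustration; real devices and collectors are of course far larger.

```python
from collections import deque

# Hypothetical device with room for only three log entries.
device_buffer = deque(maxlen=3)

# Stands in for a central log server with ample disk space.
central_store = []

for n in range(1, 6):
    entry = f"event {n}"
    device_buffer.append(entry)  # the oldest entry is dropped once the buffer is full
    central_store.append(entry)  # the forwarded copy is retained

print(list(device_buffer))  # only the three most recent events survive on the device
print(len(central_store))   # all five events remain available centrally
```

Only the central copy retains the early events that an engineer would need for context.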
For organizations required to comply with legal or regulatory reporting obligations such as those imposed by the Sarbanes-Oxley Act or HIPAA, properly implemented log management systems can advance the overall efficiency of compliance activities considerably. While many commercial log management solutions flagrantly tout compliance as a major selling point, the degree of variance in real-world auditing requirements and scope is vast, and particular audit controls are often company-specific. (See Jennifer Bayuk's 'Information system audit basics'.) Therefore, any notion of "compliance in a box" should be quickly disregarded as marketing hype.
However, assuming company-specific control audits may be properly massaged into the log management system, reports commonly requested by auditors can be canned into templates that are generated ad-hoc, perhaps even at the auditor's discretion through self-help facilities. Of course, the integrity of a specific log management system must be blessed by your official assessor, so obtaining prior auditor review should be an integral step in the log management system purchase criteria.
Choosing the right log management system
Before making any purchases, it's important to remember that not all log management systems are created equal. Regardless of whether you examine a specific product or a particular service hosted in the cloud, the variation in log management functionality can typically be simplified by ascertaining which category the system falls into. Fortunately, there are only two basic classes of log management system:
- the basic centralized log collector
- the security information and event management, or "SIEM"-style, system
Centralized logging servers are just that: no-frills systems designed to simply collect and consolidate logs from numerous sources for later consumption. Venerable contenders in this space are typically of the UNIX variety, and often sport open-source packages such as syslogd or Syslog-ng. Even modest hardware configurations coupled with open source software can handle considerable amounts of log data. However, do keep in mind that raw consolidation power is only part of the equation. Centralized logging servers lend themselves well to complex, hierarchical logging systems in which each component of the system must do one thing very well.
Conversely, if you need a system that can actually analyze data in order to extract meaningful information, then you are instead in the market for a SIEM.
SIEMs take event consolidation to the next level by providing not only event collection, but also aggregation, correlation, alerting, and reporting services. Event aggregation allows users to quickly ascertain how many events occurred without having to painstakingly count each and every event in detail. Correlation is essentially multidimensional analysis that pivots two or more categories of events against each other in order to yield high level information.
A typical example of SIEM correlation would be to automatically connect a series of brute-force login attempts to a sudden spike in network traffic from the host in question. This provides credibility, often in the form of a numeric "weight," to the assumption that one of those brute login attempts actually succeeded. Correlation provides tremendous value to humans staring at otherwise isolated events, and is particularly useful for identifying ominous activity occurring across numerous systems. Finally, alerting allows administrators to configure triggers which notify staff of anomalous or potentially threatening activity, and reporting ties everything together by summarizing events, trends, and incidents in various formats.
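To make aggregation and correlation a little more concrete, here is a minimal sketch of the brute-force example above. It is not how any particular SIEM works internally. The event tuples, the host addresses, the threshold of five failures, and the five-minute window are all assumptions chosen for illustration.

```python
from collections import Counter

# Hypothetical, already-normalized events: (minute, host, event_type)
events = [
    (1, "10.0.0.5", "login_failure"),
    (1, "10.0.0.5", "login_failure"),
    (2, "10.0.0.5", "login_failure"),
    (2, "10.0.0.5", "login_failure"),
    (3, "10.0.0.5", "login_failure"),
    (4, "10.0.0.5", "traffic_spike"),
    (2, "10.0.0.9", "login_failure"),
    (7, "10.0.0.7", "traffic_spike"),
]

FAILURE_THRESHOLD = 5  # assumed: number of failures that suggests brute forcing
WINDOW_MINUTES = 5     # assumed: how soon after the failures a spike must occur


def aggregate(events):
    """Aggregation: count events per (host, event_type) pair."""
    return Counter((host, kind) for _, host, kind in events)


def correlate(events):
    """Correlation: flag hosts whose repeated login failures are followed
    by a traffic spike within the configured window."""
    counts = aggregate(events)
    suspects = []
    for host in sorted({h for _, h, _ in events}):
        if counts[(host, "login_failure")] < FAILURE_THRESHOLD:
            continue
        last_failure = max(
            t for t, h, k in events if h == host and k == "login_failure"
        )
        spikes = [t for t, h, k in events if h == host and k == "traffic_spike"]
        if any(0 <= t - last_failure <= WINDOW_MINUTES for t in spikes):
            suspects.append(host)
    return suspects


print(aggregate(events)[("10.0.0.5", "login_failure")])
print(correlate(events))
```

Neither the failures nor the spike would stand out alone; the host is flagged only because both occur together, which is the essence of correlation.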
Also see 'Network security: The basics' by Stephen Northcutt
While open source SIEMs such as "OSSIM" are starting to make their debut in business environments, their gallant strides toward enterprise adoption are often met with stiff resistance in a market space otherwise dominated by mature, commercial offerings from companies such as Q1 Labs, RSA, and ArcSight. The overall landscape of the SIEM industry is well covered by analyst firms such as Gartner and Forrester, and there are plenty of freely available product reviews to peruse at one's leisure.
Upon examining the available offerings, it should become apparent that most SIEM solutions have a common baseline set of functionality, and follow a relatively standardized pattern of collecting and consolidating logs, transforming the logs into an internal (and almost always proprietary) format, then providing alerting and reporting services on top of this normalized information. Despite the fact that most SIEM offerings share a common set of functions, there are in fact differentiating features as well as strengths and weaknesses among the competitors. As form should follow function, having a thorough understanding of your organizational requirements will ensure you choose the right log management system.
Before shopping for a log management system, some basic planning on your part is in order. It is imperative to first survey the types of logs your infrastructure generates, and then determine what type of information you plan on gleaning from these event-generating systems. This sounds deceptively simple, yet oftentimes turns out to be quite challenging given the fact that even small organizations can easily generate over a dozen different logging formats such as Syslog, Netflow, Windows Event Logs, and SNMP.
Adding to the fun are usually a handful of proprietary formats as well, more often than not from the telecom side of the house. Wrangling this proverbial herd of cats together with a log management system requires one to think about the end product, and to keep in mind that log management systems typically normalize various data formats into a common schema. Rather than examining each type of log data in isolation, the end goal is to instead analyze a well-orchestrated view of aggregate information across many different sources. Therefore, you should not only think about how a singular piece of log information is useful by itself, but also how it can enhance the usefulness of other information through correlation.
There's no need to go overboard here by hiring expensive data architects or database administration wizards. Simply think about how each component fits into the big picture and you'll be well on your way to a successful deployment.
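As a rough illustration of what normalizing into a common schema means, the sketch below maps two differently shaped inputs onto the same four fields. The schema itself, the simplified syslog-style line (real syslog messages also carry a timestamp and more), the Windows-style record layout, and the level-to-severity mapping are all assumptions made for this example rather than the format of any specific product.

```python
import re

# Assumed common schema: source, host, severity, message.
# Simplified syslog-style line: "<PRI>host message"
SYSLOG_PATTERN = re.compile(r"^<(?P<pri>\d+)>(?P<host>\S+) (?P<msg>.*)$")

# Assumed mapping from Windows-style level names to syslog-style severities.
WINDOWS_LEVELS = {"Critical": 2, "Error": 3, "Warning": 4, "Information": 6}


def from_syslog(line):
    """Normalize a simplified syslog-style line such as '<34>fw01 link down'."""
    match = SYSLOG_PATTERN.match(line)
    if match is None:
        raise ValueError(f"unrecognized syslog line: {line!r}")
    pri = int(match.group("pri"))
    return {
        "source": "syslog",
        "host": match.group("host"),
        "severity": pri % 8,  # syslog PRI is facility * 8 + severity
        "message": match.group("msg"),
    }


def from_windows(record):
    """Normalize a hypothetical Windows-style event record (a plain dict)."""
    return {
        "source": "windows",
        "host": record["Computer"],
        "severity": WINDOWS_LEVELS.get(record["Level"], 6),
        "message": record["Message"],
    }


normalized = [
    from_syslog("<34>fw01 link down"),
    from_windows({"Computer": "dc01", "Level": "Error", "Message": "logon failed"}),
]
for event in normalized:
    print(event)
```

Once both records share the same fields, they can be counted, sorted, and correlated together regardless of where they came from.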
With respect to log management system planning, one must take the overall volume of logs, as well as the geographical dispersion of systems into consideration. As mentioned earlier, log management systems are typically categorized into two different families: the basic log collector, and the SIEM.
However, these two classes are by no means mutually exclusive, and can in fact work in concert to build highly scalable systems capable of handling massive amounts of data from systems spread far and wide. Take a hypothetical company with a headquarters office located in Chicago and satellite offices in New York and California. Assuming each office generates a fair amount of system logs, one could deploy the potentially expensive commercial SIEM within the headquarters, yet strategically position free, open source log collectors/forwarders at the satellite offices. Within the California and New York offices, the chatty local traffic could be aggregated at the local server, which would then filter out noise and forward only important events upstream to the SIEM in Chicago. Rather than receiving logs from each and every server at the remote offices, the Chicago-based SIEM would only see events originating from two remote aggregation servers. This not only cuts down on extraneous network traffic, but may also reduce your license count dramatically.
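The filtering step at a satellite office might look something like the following sketch. The host names, the sample events, and the policy of forwarding only events at warning severity or worse are assumptions for illustration; in practice this logic would live in the collector's own configuration rather than in a script.

```python
# Syslog-style severities: lower numbers are more severe (0 = emergency, 7 = debug).
FORWARD_AT_OR_BELOW = 4  # assumed policy: forward warnings (4) and anything more severe


def select_for_upstream(events, threshold=FORWARD_AT_OR_BELOW):
    """Return only the events important enough to send to the central SIEM."""
    return [event for event in events if event["severity"] <= threshold]


# Hypothetical events collected at the New York office.
local_events = [
    {"host": "ny-web01", "severity": 6, "message": "session opened"},
    {"host": "ny-web01", "severity": 3, "message": "disk failure"},
    {"host": "ny-db01", "severity": 7, "message": "debug trace"},
    {"host": "ny-db01", "severity": 4, "message": "replication lag"},
]

upstream = select_for_upstream(local_events)
print(f"forwarding {len(upstream)} of {len(local_events)} events")
```

The noisy informational and debug entries stay on the local collector, where they remain available for troubleshooting, while only the events worth attention travel to Chicago.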
SIEM solutions that natively support standard formats such as Syslog and Netflow are better equipped to support heterogeneous environments consisting of mixed open source and commercial components. Be on the lookout for signs of vendor lock-in, such as solutions that require proprietary agents to be installed on all systems, or systems that generally do not play well with open standards.
Reading the logs
Having examined the general benefits of log management systems as well as the planning methodology that goes into a successful log management system roll-out, one must take into consideration what resides at the very heart of the system. I am of course referring to the logs themselves, and understanding their various formats and nuances is an integral factor in deciding which specific technologies to deploy. An in-depth analysis of specific log formats is well beyond the scope of this article, but suffice it to say that certain log formats are intended for human consumption, whereas others are better suited to machine parsing.