Making metadata meaningful for network security

Done right, analytics applied to metadata can help find what you were looking for, and many things you weren’t.

01 Big Brother

Regulatory mandates govern data collection in many countries, some far more strictly than others. Several European countries, such as France and Germany, have strong privacy laws that limit what data can be captured, where it is sent, where it is stored, and who can access it. Even in countries with strong privacy laws, however, metadata collection is not necessarily prohibited. Multinational organizations that I've worked with use technology to collect data automatically and convert it into metadata, but they don't allow human analysis unless there is cause, such as a security incident.

Legal counsel should vet the solutions you decide upon. The solutions should also be in line with organizational policies. This may require updating employees on privacy expectations and building general awareness of the “how and why” of your data collection policy. Without these steps, metadata collection may be considered illegal or contrary to your organization's security policies. Simply put: get permission.

02 Invisibility

Encrypted network traffic is everywhere. Business communications, email, file transfers and even social media often use encrypted communication. Many organizations have stated that around 40 percent of their packets are encrypted. In addition to legitimate encrypted traffic, a growing number of advanced threats use encryption, generally SSL, as a mechanism to bypass security controls, facilitate command and control activity and steal sensitive data.

High-level information can still be collected even when encryption is being used. For example, source and destination IP addresses and certificate information are all in the clear. But to provide practical value beyond that, the traffic itself must be decrypted. Fortunately, there are a number of network security solutions that are purpose-built for real-time decryption. These decryption solutions generally operate inline: encrypted data goes in one end and decrypted data comes out the other for analysis by whatever security solutions need visibility. In addition to purpose-built solutions, there are a number of firewall, proxy and related tools that offer decryption.
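To make the "in the clear" point concrete, here is a minimal Python sketch that reads the source and destination addresses out of an IPv4 header without touching the payload. The function name, the sample packet, and the documentation-range addresses are my own illustrative assumptions, not part of any particular product.

```python
import socket
import struct

# Fixed 20-byte IPv4 header (no options), network byte order.
IPV4_HEADER = struct.Struct("!BBHHHBBH4s4s")


def cleartext_ip_metadata(packet: bytes) -> dict:
    """Read header fields only; the (possibly encrypted) payload is ignored."""
    if len(packet) < IPV4_HEADER.size:
        raise ValueError("packet is shorter than an IPv4 header")
    (version_ihl, _tos, total_length, _ident, _flags_frag,
     _ttl, protocol, _checksum, src, dst) = IPV4_HEADER.unpack_from(packet)
    if version_ihl >> 4 != 4:
        raise ValueError("not an IPv4 packet")
    return {
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
        "protocol": protocol,          # 6 = TCP, 17 = UDP
        "total_length": total_length,  # header plus payload, in bytes
    }


# Hypothetical packet: a hand-built header followed by 20 opaque bytes
# standing in for a payload we never parse.
sample = IPV4_HEADER.pack(
    0x45, 0, 40, 0, 0, 64, 6, 0,
    socket.inet_aton("192.0.2.10"),
    socket.inet_aton("198.51.100.7"),
) + b"\x00" * 20

print(cleartext_ip_metadata(sample))
```

Everything this returns is available whether or not the session is encrypted, which is also why it tells you so little about what was actually said.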

03 Suck the data up

I worked in the Security Information and Event Management (SIEM) space for years. We would talk about solutions with the ability to ingest logs, events, and alerts from disparate assets throughout the network. These systems could capture thousands of logs a second and are still an important part of network security.

Over the last few years Security Intelligence and Analytics (SIA) solutions -- often called SIEM for packets or big data security solutions -- have become common. Where once we discussed thousands of logs a second, we are now talking about the collection of millions of packets, flows and sessions a second. Because of the volume, velocity and variety of the packets, solutions designed to collect data off the wire at the packet level need to be able to operate with lossless collection on 2, 10, and even 40 Gbps networks. If you can't capture the packets and you can't reassemble sessions and files, then you're not getting value from your metadata. High-speed packet collection is a prerequisite for any metadata strategy.
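For a rough sense of what "millions of packets a second" means, here is a back-of-the-envelope Python sketch of the theoretical packet rate on a fully loaded Ethernet link. It assumes standard Ethernet framing, where each frame carries 20 extra bytes on the wire (8 bytes of preamble and start delimiter plus a 12-byte inter-frame gap); real traffic mixes frame sizes and rarely fills the link, so treat these as upper bounds.

```python
# Per-frame wire overhead on Ethernet: 8-byte preamble/SFD + 12-byte inter-frame gap.
ETHERNET_OVERHEAD_BYTES = 20


def max_packets_per_second(link_gbps: float, frame_bytes: int) -> float:
    """Theoretical packet rate on a saturated link for a single frame size."""
    bits_per_frame = (frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8
    return link_gbps * 1e9 / bits_per_frame


for gbps in (2, 10, 40):
    small = max_packets_per_second(gbps, 64)    # minimum-size frames
    large = max_packets_per_second(gbps, 1518)  # maximum standard frames
    print(f"{gbps:>2} Gbps: {small / 1e6:.2f} Mpps at 64 bytes, "
          f"{large / 1e6:.2f} Mpps at 1518 bytes")
```

At 10 Gbps with minimum-size frames the ceiling is roughly 14.88 million packets a second, which is the worst case a lossless collector has to be sized for.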

04 Stick the data in a box

Your packet collection is fast, efficient and effective. Now you have to store it. Metadata is usually used for a combination of real-time and forensic analysis. With millions of packets a second crossing the wire, the packets must be able to flow over the network and into the storage system with minimal latency while remaining usable.

Solutions that work well in this area ensure that data is indexed across a wide range of parameters for efficient retrieval. Because there are thousands of network applications, each with hundreds of attributes, it is important to leverage a solution that is extensible enough to store the packets, break them down into disparate pieces of metadata, and use indexing to make that metadata useful after the fact.
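As a toy illustration of indexing across many parameters, here is a small in-memory inverted index in Python. The class, the attribute names, and the sample records are all hypothetical; a production system would index on disk and at far larger scale, but the idea of mapping each attribute/value pair to the records that contain it is the same.

```python
from collections import defaultdict


class MetadataIndex:
    """Toy inverted index: (attribute, value) -> ids of matching records."""

    def __init__(self):
        self._records = []
        self._index = defaultdict(set)

    def add(self, record: dict) -> int:
        """Store a record and index every attribute (values must be hashable)."""
        record_id = len(self._records)
        self._records.append(record)
        for attribute, value in record.items():
            self._index[(attribute, value)].add(record_id)
        return record_id

    def search(self, **criteria) -> list:
        """Return records matching all criteria by intersecting id sets."""
        matches = None
        for attribute, value in criteria.items():
            ids = self._index.get((attribute, value), set())
            matches = ids if matches is None else matches & ids
        if matches is None:  # no criteria given: return everything
            return list(self._records)
        return [self._records[i] for i in sorted(matches)]


index = MetadataIndex()
index.add({"src": "192.0.2.10", "dst_port": 80, "app": "http", "file_type": "pdf"})
index.add({"src": "192.0.2.11", "dst_port": 443, "app": "ssl"})
index.add({"src": "192.0.2.10", "dst_port": 80, "app": "http", "file_type": "exe"})

print(index.search(src="192.0.2.10", file_type="exe"))
```

Because each lookup touches only the id sets for the requested attributes, adding a new attribute type does not require changing the schema, which is the extensibility the paragraph above is asking for.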

05 Pull the data back out

We'll assume that your data collection and storage are humming along at this point. But now you want to access the storage system for metadata analysis. If you can't search the metadata quickly and derive results in a reasonable amount of time, its value proposition will rapidly diminish. There are a number of variables that relate to retrieval speed. Talk to the folks working in NOCs and SOCs and they'll tell you that as a rule of thumb, you need to be able to retrieve results across gigabytes of data in seconds and terabytes within minutes. Anything slower than that simply becomes too cumbersome to be practical.

Here are a few points to consider when thinking about your architecture and speed requirements for data retrieval:
  • What are the speeds of the networks I'm collecting data on and how utilized are they?
  • What type of tap solutions can I take advantage of?
  • Do I want visibility into encrypted traffic?
  • What are my raw packet storage needs, and what are my metadata storage needs (metadata is generally stored longer and takes up less storage space than raw packets)?
  • Do I want to store files such as PDFs, Word docs, images, videos, etc. locally or outside of my primary data store?
  • Will I have one or multiple data collection points, and will they need to be centrally managed?
  • How many users will be active on my system?
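
Several of these questions (network speed, utilization, raw versus metadata retention) combine into a simple sizing estimate. The Python sketch below is one way to rough it out. The 30 percent utilization, the retention periods, and especially the 5 percent metadata-to-raw ratio are placeholder assumptions of mine; substitute figures measured on your own network.

```python
SECONDS_PER_DAY = 86_400


def raw_bytes_per_day(link_gbps: float, utilization: float) -> float:
    """Bytes captured per day on one link at an average utilization (0 to 1)."""
    return link_gbps * 1e9 / 8 * utilization * SECONDS_PER_DAY


def storage_estimate(link_gbps: float, utilization: float, raw_days: int,
                     metadata_days: int, metadata_ratio: float) -> dict:
    """Estimate raw-packet and metadata storage for one collection point.

    metadata_ratio is the assumed size of metadata relative to the raw
    packets it was derived from (a placeholder, not a measured value).
    """
    daily_raw = raw_bytes_per_day(link_gbps, utilization)
    return {
        "raw_bytes": daily_raw * raw_days,
        "metadata_bytes": daily_raw * metadata_ratio * metadata_days,
    }


# Hypothetical site: one 10 Gbps link, 30% utilized, 7 days of raw packets,
# 90 days of metadata at an assumed 5% of raw size.
estimate = storage_estimate(link_gbps=10, utilization=0.30, raw_days=7,
                            metadata_days=90, metadata_ratio=0.05)
for name, size in estimate.items():
    print(f"{name}: {size / 1e12:.1f} TB")
```

Multiply by the number of collection points to get a first-pass total, then revisit the ratio once you know which attributes you actually keep.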

This post is part one of my discussion on metadata, and we've really only scratched the surface. Regardless of your metadata use cases -- including situational awareness, incident response, data loss monitoring, and advanced threat prevention -- the fundamentals outlined here apply. What other strategies is your organization embracing as they relate to metadata and network security?

[image credit: Flickr/xmacex (CC BY-SA 2.0)]

Copyright © 2013 IDG Communications, Inc.
