Closing the data floodgates

Using data loss prevention systems and techniques to stem the flow of critical data

I grew up in south Florida, probably one of the flattest places in the country. We had no mountains, hills or even mounds — nothing but flat in all directions. There was one diversion from the flat when I was a kid — an odd ravine along a residential street. We referred to it as the "deep deep" and drove by for a look every chance we got.

Over 30 years ago, I moved to Atlanta, a land of hills and valleys. My house backs up to a floodplain area with a ravine that makes the "deep deep" in Miami look small by comparison. Since I see it every day from my window, I really don't think much about it anymore.

So, what does this reminiscence have to do with preventing data loss? I would suggest that the underlying problem is the same. Companies concerned about losing key data, such as the elements regulated by HIPAA and PCI, begin watching their communication channels (email, USB drives, etc.) for the presence of such data, and filter out the critical items. It seems an easy task at first, but after the hundredth email message, their eyes glaze over, causing them to miss data items, just like me looking out my window, and no longer noticing my ravine. Thus, there is a legitimate need for some automated approach to monitoring communication channels for inappropriate data.

Data loss prevention (DLP), sometimes called data leak protection, is an attempt to monitor common communication channels for the presence of controlled data, and to mask the data, preventing its transmission in readable format.

According to the SANS Institute, the earliest DLP products hit the market in 2006, with the product class gaining steam in 2007. In 2010, Fujitsu published a detailed research paper on the topic, laying out the specifics of how a DLP system needed to function. The market has grown into a booming business since then, with products from a variety of sources, both commercial and open source.

From a high level, the idea is to employ automation to watch for the outflow (and in some cases, inflow) of controlled data by its pattern, for instance a Social Security or credit card number, via a communication channel. When such a pattern is found, the system can mask the data automatically to prevent its unauthorized disclosure, and also log the source for investigation. A key prerequisite to the loss prevention process is understanding what data elements you have, and where they are. According to Randy Trzeciak, the technical manager of the CERT Insider Threat Center at the Carnegie Mellon Software Engineering Institute, “If you don't know what they are and who has access, then it is hard to either detect or protect.”
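To make the detect, mask and log cycle concrete, here is a minimal sketch in Python. It is not taken from any particular DLP product. The function names, the two regular expressions and the shape of the findings list are all assumptions made for illustration, and real products use far more robust detection than this.

```python
import re

# Illustrative patterns only (assumptions, not any vendor's rules).
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def mask_message(text: str, source: str):
    """Mask SSN-like and card-like values; return (masked_text, findings)."""
    findings = []  # stands in for the log a real system would write

    def mask_ssn(match):
        findings.append({"source": source, "type": "ssn"})
        return "***-**-" + match.group()[-4:]

    def mask_card(match):
        digits = re.sub(r"\D", "", match.group())
        if not luhn_valid(digits):
            return match.group()  # likely not a card number; leave as-is
        findings.append({"source": source, "type": "card"})
        return "*" * (len(digits) - 4) + digits[-4:]

    text = SSN_RE.sub(mask_ssn, text)
    text = CARD_RE.sub(mask_card, text)
    return text, findings
```

Calling mask_message("SSN 123-45-6789 and card 4111 1111 1111 1111", "alice@example.com") returns the text "SSN ***-**-6789 and card ************1111" along with two findings that record the source. The Luhn checksum is there to cut down on false positives: a 16-digit order number that fails the check is left untouched.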

Drilling down a bit, most of the products look for predefined data types, such as credit card numbers, as well as custom-defined patterns. They offer a variety of additional means to locate data to be masked, including keyword matching, predefined dictionaries and match expressions. They can integrate with a variety of other technologies, including Microsoft Active Directory, SMTP servers, databases, and custom code via API, and automatically discover and monitor workstations, giving the products a variety of vantage points from which to watch for data leakage.
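The combination of keyword dictionaries and custom-defined patterns can be sketched the same way. The Policy class below is hypothetical, as is the "MRN-" record number format used in the example; it only illustrates how a single rule set might bundle both kinds of match.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Policy:
    """A hypothetical rule set: a keyword dictionary plus custom patterns."""
    name: str
    keywords: set = field(default_factory=set)    # lowercase words
    patterns: list = field(default_factory=list)  # compiled regexes

    def matches(self, text: str) -> list:
        """Return (kind, value) pairs for every hit found in text."""
        hits = []
        # Dictionary match: compare whole lowercase words against the set.
        words = set(re.findall(r"[a-z0-9]+", text.lower()))
        for kw in sorted(self.keywords & words):
            hits.append(("keyword", kw))
        # Custom pattern match: report each matching substring.
        for pat in self.patterns:
            for m in pat.finditer(text):
                hits.append(("pattern", m.group()))
        return hits


# Example policy; the MRN-###### format is invented for illustration.
health = Policy(
    name="health-records",
    keywords={"diagnosis", "patient"},
    patterns=[re.compile(r"\bMRN-\d{6}\b")],
)
```

With this policy, health.matches("Patient MRN-004213 diagnosis attached") reports the keywords "diagnosis" and "patient" and the pattern hit "MRN-004213". A commercial product would layer weighting, thresholds and exceptions on top of this kind of matching.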

So, why isn't everyone with a large volume of regulated data installing a DLP system? For one thing, price: the three-year total cost of ownership ranges from $100,000 to over $1 million for large installations. The other downside is that these systems can provide a false sense of security for companies that assume they can install one and then stop worrying. “That has finally fizzled from most minds,” says Anton Chuvakin, a research vice president at the Gartner Group, in an interview for SC Magazine, referring to the idea of worry-free installations.

If you are considering a DLP system, in addition to the basics mentioned above, the following are some features to look for, as recommended by the Info-Tech Research Group:

  • Ability to handle regulatory policies such as HIPAA and PCI out of the box.
  • Active Directory integration.
  • Some support for protection functions on mobile devices.
  • Centralized management and reporting.
  • Support for removable media.

There are now a wide variety of products in the DLP market space, which can be stratified into the following tiers:

The big players

The major players in the DLP market include Symantec, arguably the best known, along with McAfee. If you choose one of these, you will get a comprehensive and well-supported product, but you must be prepared to pay for it.

Mid-market players

This group of products is priced to be an option for smaller enterprises. Products include Trend Micro, which was the top-rated product for value in the Info-Tech Research Group study of the market, along with Palisade, recently acquired by Absolute Software.

Open source

For those with more time than money, there are open source DLP products available. These include OpenDLP, which has solid functionality but, with no code commits since 2012, does not appear to be actively maintained. Another option is snortdlp, which is built using the popular Snort intrusion prevention system as a base. Finally, there is MyDLP, an open source product acquired by Comodo, with open source and supported versions available.

Cloud environment products

With the growing use of cloud-based services such as Box and Office 365, a class of DLP products focused on the cloud has emerged. These include Netskope and CloudLock.

If I can leave you with one piece of advice, it is that no DLP system is turnkey. They will all require considerable configuration and monitoring. If you think you can buy a DLP system, install it, and end your data leakage concerns forever, you are in for an unpleasant surprise.

Copyright © 2015 IDG Communications, Inc.
