Over the last 10 years, investments made by Brazilian banks in cybersecurity have grown substantially, and so has the data. Now at least two camps are emerging in the debate over what data should be kept, and for how long.

Brazil has long been a pioneer in financial services. It was one of the first countries in the world to offer Internet banking, which also means it was one of the first countries in the world trying to mitigate the risks related to Internet banking. However, Brazil has an extreme shortage of security professionals. Coupled with sometimes-fragile infrastructure, this means teams are used to getting through issues with “white knuckle” security: no time, no resources, and results needed yesterday. While this model was never very good, it actually worked for some years. It didn’t work great, but it got the job somewhat done through sheer willpower. Today’s threats, however, operate with greater stealth, speed and sometimes sophistication than their predecessors, and as such they require a better and more exhaustive approach. One element of this new approach is big security data.

Raw Data and Metadata

Our conversations were primarily around raw network packet data and the metadata derived from those packets. When it comes to log management and SIEM solutions, collection is generally measured in thousands of logs per second and retention in terms of quarters or even years. Regulatory mandates can play an important role in just how long these logs and alerts are kept. This isn’t necessarily the case with packet data. Packets, and the artifacts they carry (Microsoft Office documents, voice messages, videos, ISO images and the rest), can eat through storage quickly, especially when, instead of thousands of logs per second, there are several million packets per second to be read, indexed, classified and stored. In order to have a sufficient volume of data to empower the security team, how much should be kept?
The answer to packet data retention fluctuates dramatically based on factors such as budget, use cases, desired raw-packet versus metadata retention levels, existing storage capacity, utilization of network pipes (for example, a 10 gig link may only be 20% utilized) and other variables depending on the organization’s needs. There isn’t a one-size-fits-all approach to retention.

More or Less

Across the banks in Brazil there seemed to be two primary camps on the topic of data retention. Group one felt that packet data and related artifacts are most beneficial when used in a near-real-time capacity, with retention stretching back only about a week. In this group’s opinion the data starts to become less valuable and operationally useful very quickly, and its value past a week doesn’t justify the storage cost. Generally, this group felt that the less storage-intensive metadata should be retained longer. For these banks insider threats are always a core use case, and the idea is that insider threat analysis requires longer histories so that user trends can be analyzed. In this case the average retention was thought to be around six months for metadata and around seven days for packet data.

The approach of group two was simply one of “more.” While they agreed that metadata retention should be longer than raw packet and artifact storage, they felt that group one’s approach was far too limiting for their use cases. Not only did they want to conduct analysis in near real time, but they also wanted the ability to conduct longer-term forensic analysis against a much larger window. Many of the use cases driving this need were derived from potential 0-days, APTs and the like: for attacks they may not catch right away, they wanted to be able to go back forensically to determine how the attackers initially got in, when they got in, whether they are still in and, in the case of data theft, what was taken.
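To make the storage trade-off between the two camps concrete, here is a rough back-of-envelope sketch. The link speed, utilization figure and retention windows below are illustrative assumptions drawn from the examples in this article (a 10 Gb link at 20% utilization; one week versus two quarters of raw packets), not figures reported by any bank, and the estimate ignores capture overhead, indexing and compression:

```python
# Back-of-envelope estimate of raw packet-capture storage.
# All inputs are illustrative assumptions, not measured figures.

def capture_storage_tb(link_gbps: float, utilization: float, days: int) -> float:
    """Approximate storage (TB) needed to retain full packet capture."""
    bytes_per_sec = link_gbps * 1e9 / 8 * utilization  # bits/s -> bytes/s
    total_bytes = bytes_per_sec * 86_400 * days        # 86,400 seconds per day
    return total_bytes / 1e12                          # bytes -> terabytes

# Group one's window: a 10 Gb link at 20% utilization, kept for 7 days.
print(round(capture_storage_tb(10, 0.20, 7), 1))

# Group two's window: the same link, kept for two quarters (~180 days).
print(round(capture_storage_tb(10, 0.20, 180), 1))
```

Even at modest utilization, the seven-day window lands in the low hundreds of terabytes, while two quarters pushes into the petabyte range, which is why the "more" camp's position is ultimately a budget and infrastructure question as much as an analytical one.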
Most of these banks discussed packet retention in terms of quarters. They felt they needed at least two quarters of raw packet data and at least two years of metadata.

I’m curious to hear how other organizations are approaching the question of big security data storage, packets or otherwise. What are the use cases driving your retention requirements? Is it truly just a function of budget, where the more dollars you have the more drives you buy, or are there more tangible, perhaps operational, variables driving these decisions?

Image credit: Flickr/Roger Wollstadt (CC BY-SA 2.0)