Until very recently, the network and security operations for AMERICAN SYSTEMS, a Virginia-based IT management and consulting firm, were two separate entities. But the company's CIO, Brian Neely, was looking for a way to centralize its IT tools and streamline event correlation, performance monitoring and security information management. Redundancies among engineers working in both centers also prompted AMERICAN SYSTEMS to look for more efficiency.
CSO spoke with Neely about the process of bringing their NOC and SOC together, and how other organizations considering convergence might learn from their experience. (For more on the topic, see Efficiency Through NOC/SOC Convergence.)
CSO: What was the status of your network operations and your security operations before you began your convergence efforts?
Brian Neely, CIO of AMERICAN SYSTEMS: We operated under a siloed approach and worked primarily with point solutions for security, performance and event monitoring. We have a relatively small staff and require all of our engineers to multitask, meaning that the monitoring of network and security operations, and the respective response, diagnosis, investigation and reporting functions, are typically performed by the same engineers.
Why did you want to make changes to a more converged approach?
In the end, it's about IT service reliability, integrity and protection: how do we remove barriers, extend controls and leverage processes to improve IT responsiveness and reduce costs and risks?
To achieve our IT objectives for improved security information management capabilities, event correlation, and performance monitoring, we needed to move forward with a single, integrated point of contact for all network and security events.
New challenges were also emerging at an accelerating rate, from sophisticated security threats to increased demands from our business functions to deliver and manage better service levels. We therefore needed to be more proactive on all fronts to ensure higher availability, improved security, and increased data confidence.
We wanted to centralize instrumentation and have the means to extend operational controls. This required a solution that would integrate IT governance, risk and compliance management functionality. With our audit requirements, we needed to advance to a solution where rules and reports could be mapped to our management frameworks (CobIT, ITIL) and compliance best practices (SOX).
Prior to convergence, had the company purposely separated security and network ops?
We had our security and network operations logically separated. I wanted our security people to have security as their primary focus, without distraction. Now they're co-located and work hand in hand. Experience and knowledge are extended across three different tiers in the infrastructure services group. There have been no issues or negative feedback.
What do you call the new converged NOC and SOC?
It's called the Core Operations Center—not Network Operations or Security Operations.
What does it look like?
We have occupied our current facility for 11 years and are moving to a newly designed facility in the first quarter of 2010. We are in a flat space, with hard-walled offices. In the new facility, engineers and security personnel will be grouped together for monitoring in case of data breaches, loss of power, system outages and data contamination (we work with classified data). It is designed to be a flexible, high-tech workspace equipped with electronic interactive whiteboards, multiple real-time communication pathways, video cameras, multiple interactive displays and an integrated information "hub" to work dynamically and quickly, resolving any and all threats.
We are a heavy Microsoft shop and we are leveraging SharePoint to give us Web-based access to a single portal from anywhere, anytime, like a traditional NOC. We use AccelOps' integrated monitoring, analytics and reporting for both security and network operations. Its business service instrumentation can also complement the Core Operations Center.
Tell me more about the AccelOps tool. How does it work in your environment? Why did you choose it?
AccelOps intrigued us because they offer all the security monitoring, alerting and reporting functionality but also incorporate performance and configuration metrics. We liked that they came from the Cisco realm. Their overall vision aligned with one of our IT objectives for tasking, event correlation and monitoring.
As mentioned earlier, AccelOps gives us a single console where we can get insight into many different aspects of our extended datacenter, the network devices, systems and applications that support it. AccelOps covers security, availability, performance and change management of the infrastructure, thereby giving us our NOC and SOC convergence.
AccelOps itself is a virtual appliance application running on VMware. My team had it up and working in a couple of days; most of that time was spent configuring our infrastructure to communicate with AccelOps. Within a week we had everything running: monitoring, alerting, tracking, reporting and defined business services. Because it is a virtual appliance, we can scale capacity as needed by adding more VMs or storage to the application.
What types of intelligence do you need to gather to accomplish quicker correlations?
AccelOps provides my team instant intelligence on our business posture, security threats and operational issues through its integrated, interactive and customizable web GUI. Now our engineers can more easily collaborate and get both high-level and detailed views into our network, systems, applications and user activity for a variety of purposes.
For example, we can validate and monitor the effect of an approved change such as a patch, which would support security and ITIL service management processes. Beyond built-in functions, the team could quickly build out rules, search on incidents, make new dashboards and generate reports.
We liked the fact that AccelOps data management was embedded and optimized for long-term access to all the collected data; all the raw and correlated data is online. If one product is seeing one pattern and another system reports another issue, most likely there is a critical event. There can be a variety of security, performance and availability events that may or may not be related but could affect reliability or integrity. At the same time, monitoring and reporting from different products and sources is time-consuming and difficult. Conventional approaches could take weeks to walk through or investigate events and log information as well as analyze network and system behavior. That approach is more reactive; the idea is to be proactive and efficient.
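To illustrate the kind of cross-source correlation described here, the following is a minimal Python sketch. It is not AccelOps code and does not reflect how the product works internally. The `Event` fields, the `correlate` function, the host names and the five-minute window are all assumptions chosen for illustration. The idea is simply that when two different monitoring sources report on the same host within a short window, the combination is flagged for attention.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    timestamp: float  # seconds since some reference time (hypothetical field)
    source: str       # reporting product, e.g. "ids" or "perf-monitor"
    host: str         # device the event refers to
    message: str


def correlate(events, window_seconds=300.0):
    """Return {host: events} for hosts where two or more distinct
    sources reported within window_seconds of each other."""
    by_host = defaultdict(list)
    for event in events:
        by_host[event.host].append(event)

    incidents = {}
    for host, host_events in by_host.items():
        host_events.sort(key=lambda e: e.timestamp)
        start = 0
        for end in range(len(host_events)):
            # Shrink the window until it spans at most window_seconds.
            while host_events[end].timestamp - host_events[start].timestamp > window_seconds:
                start += 1
            window = host_events[start:end + 1]
            if len({e.source for e in window}) >= 2:
                incidents[host] = window
                break
    return incidents


# Hypothetical sample data.
sample = [
    Event(0.0, "ids", "fs01", "suspicious login"),
    Event(120.0, "perf-monitor", "fs01", "CPU spike"),
    Event(0.0, "ids", "web01", "port scan"),
    Event(50.0, "ids", "web01", "port scan"),
    Event(0.0, "perf-monitor", "db01", "slow queries"),
    Event(1000.0, "ids", "db01", "failed login"),
]
incidents = correlate(sample)
```

In this sample only `fs01` is flagged: `web01` has two events from the same source, and the two `db01` events fall outside the window.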
What benefits have you seen from the Core Operations Center? Have you experienced cost savings? More efficiency in particular areas?
We anticipate it will yield cost savings for us. Being a defense contractor, we are cost-conscious and very fiscally conservative. We are an ESOP corporation, 100 percent employee-owned. We started with AccelOps late last year when it was a start-up in beta test, and we subsequently became a charter customer. We not only saw the advantages that they delivered, but we were able to give input early in the beta process, which is great. Some of our requested features and implementation needs were baked in before their general release.
I think we'll need more time to ascertain cost savings in terms of value-add. We are seeing the advantages of more integrated monitoring. This was evident early on, when we had to find out exactly who changed certain permissions on a file share server, as well as where, what and how. AccelOps' query capability searched through thousands of events with iterative filtering to quickly find the needle in the haystack.
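The iterative filtering described here can be pictured with a short Python sketch. This is an illustration only, not AccelOps' query language or API. The event fields, host names, user names and the `narrow` helper are hypothetical. Each filter is applied to what survived the previous one, and the count at each step shows how quickly the haystack shrinks.

```python
def narrow(events, filters):
    """Apply (name, predicate) filters one at a time and record
    how many events remain after each step."""
    remaining = list(events)
    trail = []
    for name, predicate in filters:
        remaining = [e for e in remaining if predicate(e)]
        trail.append((name, len(remaining)))
    return remaining, trail


# Hypothetical events; a real log would hold thousands of these.
events = [
    {"host": "fs01", "action": "permission_change", "user": "jdoe", "path": "/shares/finance"},
    {"host": "fs01", "action": "file_read", "user": "asmith", "path": "/shares/finance"},
    {"host": "web01", "action": "permission_change", "user": "svc", "path": "/var/www"},
    {"host": "fs01", "action": "permission_change", "user": "asmith", "path": "/shares/hr"},
]

filters = [
    ("on file server", lambda e: e["host"] == "fs01"),
    ("permission changes", lambda e: e["action"] == "permission_change"),
    ("finance share", lambda e: e["path"].startswith("/shares/finance")),
]

matches, trail = narrow(events, filters)
for name, count in trail:
    print(f"{name}: {count} events remain")
```

Keeping the filters as an ordered list makes it easy to add, drop or reorder a step and rerun the search, which is the essence of narrowing iteratively.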
We can monitor more proactively and respond more efficiently. We can now isolate a security incident and understand the severity of an attack or violation faster. The same is true for performance and availability problems.
More importantly, we have an immediate understanding of the effect on IT services and severity by business impact. As we discussed earlier, with limited resources, I don't want to put more manpower into monitoring and reporting; monitoring and event aggregation is a time-intensive, tedious task. It is nearly impossible to do by applying sheer manpower. We want to take advantage of, and put the burden on, the technology that is available to us. We are beginning to see immediate value in many areas, and there is significant functionality that we haven't even tapped into at this point in our implementation.
Any advice for other organizations that might be considering NOC and SOC convergence?
Do your homework, and look at the total value products provide across your organization, not just the cheapest solution or most conventional product, which can result in modest management gains or a limited view into your systems. While we used industry research, such as Gartner and others, to narrow the field and help us scope some of our base functionality, we relied on a very qualified staff to make an informed, effective, value-based decision that took into account our business and operational requirements.