Business Continuity Event Planning: Analysis and Containment

When a business continuity event (BCE) is detected, the first impulse is to jump in and fix it as soon as possible.  In many cases, this works fine.  However, the few times the jump-and-fix approach causes more damage are justification enough to pause first, analyze the event, and notify stakeholders.

Taking time to understand the nature of the event (e.g., failed server, malware attack, human intruder, or code failure) and its scope also provides important input into containment planning and implementation.  So in this post I continue my examination of BCE response by moving from detection to preliminary analysis and containment.

Analysis and Notification

Analysis in this context is a transitional step: taking a moment to understand the BCE before proceeding with the next steps in business continuity event management (BCEM).  See Business Continuity Event Management – An overview.

BCE analysis does more than provide information about a specific failed process component (e.g., a server, employee, router, or switch).  It also examines the ramifications of the failure, documenting all upstream and downstream services affected.  Using this data, the incident response team (IRT) can review the scope of the event and identify the stakeholders affected.  The team can then prioritize recovery of affected processes and develop an action plan.

Prior to an event, the IRT should create stakeholder call lists for each supported system.  These lists are part of the overall incident response plan (IRP).  Stakeholders might include:

  • Data owner
  • Process owner
  • Managers of employees using or feeding affected systems as well as users of system output
  • Public relations
  • Legal
  • Security
  • Help Desk
  • Facilities management
  • Labor union
  • Key customers

Representatives of some or all of these stakeholder groups might already be part of your IRT, as described in a previous post.  Once the scope of the event is understood, someone from the initial response team should begin notifications while the remaining IRT members move to containment.
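
The per-system call lists described above could be kept in a simple structured form so that, once the affected systems are known, the notifier gets one de-duplicated list of people to call.  The following is only a minimal sketch: the system names, contact names, phone numbers, and the contacts_to_notify helper are all hypothetical placeholders, and real lists would come from your IRP.

```python
from dataclasses import dataclass, field


@dataclass
class Contact:
    role: str   # stakeholder group, e.g. "Data owner"
    name: str
    phone: str


@dataclass
class CallList:
    system: str
    contacts: list[Contact] = field(default_factory=list)


# Hypothetical entries for illustration only.
CALL_LISTS = {
    "billing": CallList("billing", [
        Contact("Data owner", "A. Example", "555-0100"),
        Contact("Legal", "B. Example", "555-0101"),
    ]),
    "patient-records": CallList("patient-records", [
        Contact("Data owner", "C. Example", "555-0102"),
        Contact("Legal", "B. Example", "555-0101"),
        Contact("Public relations", "D. Example", "555-0103"),
    ]),
}


def contacts_to_notify(affected_systems):
    """Return each stakeholder once, in first-seen order, across all affected systems."""
    seen = set()
    result = []
    for system in affected_systems:
        call_list = CALL_LISTS.get(system)
        if call_list is None:
            continue  # no list on file: a gap to fix in the IRP
        for contact in call_list.contacts:
            key = (contact.role, contact.name)
            if key not in seen:
                seen.add(key)
                result.append(contact)
    return result
```

A stakeholder who appears on several lists is returned only once, so nobody is called twice for the same event.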

Containment

Containment is the primary means of mitigating BCE business impact as well as risk to employees and customers.  The technology, people and processes needed to contain event consequences depend on several parameters identified during initial analysis steps, including:

  • Event type.  Containing a human or malware attack against a single network target requires a different approach from that used to deal with a broader malware infection.  Similarly, containment activities targeting a failed server differ significantly from those needed when an entire facility is lost or inaccessible.
  • Potential damage or theft.  In some cases, little containment is needed other than implementation of documented workarounds.  However, when escalating damage to critical infrastructure, theft of assets, loss of life, or other egregious outcomes are possible, more aggressive containment methods are necessary.
  • Need to preserve evidence.  If the event was caused or aggravated by human action, legal action might be appropriate.  In such cases, collecting and preserving evidence is often necessary.  Just remember that protection of human life and welfare trumps evidence preservation.
  • Information sensitivity.  In the heat of battle, it’s easy to forget that containment priorities across systems and processes are not equal.  For example, protecting the availability and confidentiality of patient information is more important than mitigating impact on marketing systems.  Containment priorities should be set before an event takes place and included in the IRP.
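
As one way to picture priorities set ahead of time, the sketch below orders the systems affected by an event using a priority table that would live in the IRP.  The system names, the priority numbers, and the containment_order helper are assumptions made for illustration, not a prescribed scheme.

```python
# Hypothetical priority table; a lower number means "contain first".
CONTAINMENT_PRIORITY = {
    "patient-records": 1,
    "billing": 2,
    "marketing": 3,
}

# Systems missing from the table are handled last.
DEFAULT_PRIORITY = 99


def containment_order(affected_systems):
    """Sort affected systems by pre-agreed priority, then by name for stable ties."""
    return sorted(
        affected_systems,
        key=lambda s: (CONTAINMENT_PRIORITY.get(s, DEFAULT_PRIORITY), s),
    )
```

Because the table is agreed on before any event, the team does not have to debate relative importance while the event is unfolding.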

Document, Document, Document

Documenting activities is never popular, and it’s even less popular when responding to a BCE.  However, failure to track actions taken, actions pending, and other information about detection and containment activities can result in time-wasting, redundant effort.  It also makes an after-action review less effective, leading the organization to make the same mistakes again and again.

Understanding what to document and in what format is part of BCEM preparation.  Some items to consider when creating record-keeping templates or guidelines include:

  • Description of BCE.  Answer who, what, when, where, how and why.  Include scope, processes and stakeholders affected, etc.  Start this document during analysis and notification.  Modify it as more information about the event is received.
  • Response activities.  Each task performed by every team member must be recorded, along with the results of the task.  Record the task when it is assigned, and add the completion time when it is done.  This helps track who is doing what and why.  Include:

    • Name of person who performed task
    • Date and time assigned
    • Name of person assigning the task
    • Completion date and time
    • Description
    • Reason for task, including intended outcome
    • Actual outcome
  • Chains of custody and other documents related to evidence collection and preservation.
  • General observations.  This category contains anything anyone deems important enough to mention.  Information provided in this way is often an invaluable part of the after-action review.  If you don’t write it down, chances are it will be forgotten.
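
The response-activity fields listed above map naturally onto a small record type.  The sketch below is one possible template, not a required format: the ResponseTask class, its field names, and the open_tasks helper are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ResponseTask:
    performed_by: str                 # name of person who performs the task
    assigned_by: str                  # name of person assigning the task
    description: str
    reason: str                       # reason for task, including intended outcome
    assigned_at: datetime             # recorded when the task is assigned
    completed_at: Optional[datetime] = None
    actual_outcome: Optional[str] = None

    def complete(self, outcome: str, when: Optional[datetime] = None) -> None:
        """Record the actual outcome and completion time (defaults to now)."""
        self.actual_outcome = outcome
        self.completed_at = when or datetime.now()

    @property
    def is_open(self) -> bool:
        return self.completed_at is None


def open_tasks(log):
    """Return tasks that were assigned but not yet completed."""
    return [task for task in log if task.is_open]
```

Creating the record at assignment time and filling in the outcome later gives the team lead a running view of pending work, and the completed log feeds directly into the after-action review.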

The final word

Don’t jump into containment without stopping to look at the event you’re facing.  This does not mean, however, that you should sit and discuss all aspects of the BCE for hours before actually doing something.  If you prepare properly for major types of events, including documenting processes and training response teams, analysis should be quick.  One way to achieve this is scenario planning.  Identify the critical events most likely to occur and train to respond to them.  Training for worst-case scenarios also gives IRTs the skills they need to deal with smaller, similar events.

I can’t overemphasize the importance of documenting the response.  Well-managed documentation keeps the team on target, helps the team lead understand how resources are allocated, and provides information critical to improving how to respond to future events.

In the next installment in this series, we look at how to plan the transition from a response posture to one of recovery.

Copyright © 2008 IDG Communications, Inc.
