Business Continuity Event Planning: Building a recovery strategy

In previous posts, we looked at understanding the business, the relationship between event response and recovery efforts, and how to build an incident response plan. The natural next step after initial response is the interim and permanent recovery of critical systems. However, before drilling into the mechanics of creating and managing a business continuity plan for recovery, I’d like to step back and take a quick look at the controlling strategic framework on which catastrophe response and recovery activities are based.

Approach

A management-approved business continuity strategy provides guidance on the requirements of initial response, what to recover, and to what extent it should be recovered. Many organizations plan to recover everything, an approach that is doomed to fail in large organizations.

Building a strategy begins with understanding the business. Only with a thorough knowledge of which processes cannot be down for even a short period can you build an effective recovery plan. Armed with operations management's approval of that list of processes, and an understanding of the underlying technology, you can make an informed decision about what to temporarily recover at a recovery site.

The approach I recommend is to:

  1. Work with business managers to identify critical processes.  Critical processes are those identified during the understand-the-business phase and ranked high when performing business impact analysis (BIA).
  2. Using the results of the BIA, and the time necessary to identify and prepare a permanent recovery site, identify the processes that must be part of the interim recovery activities (e.g., at a hot site).
  3. Work with business managers and key employees to identify technology requirements and possible manual workarounds.
  4. Document the results of Item 3 in a business recovery plan.
  5. Cycle through this process at least annually.
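Step 2 is the one place in this approach where a simple rule can help. The sketch below is only an illustration, not a prescribed method: it assumes each process carries a BIA rank and a maximum tolerable outage in days, and it selects for interim recovery any critical process that cannot wait until the permanent site is ready. The field names, the rank scale, the threshold, and the sample processes are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Process:
    name: str
    bia_rank: int              # hypothetical scale: 1 = highest impact in the BIA
    max_tolerable_days: float  # longest outage the business can tolerate
    manual_workaround: bool    # True if staff can work without the system for a while


def interim_recovery_candidates(processes, days_to_permanent_site, critical_rank=2):
    """Return critical processes that cannot wait for the permanent site.

    A process qualifies when it ranks at or above the (assumed) critical
    threshold and its tolerable outage is shorter than the time needed to
    identify and prepare the permanent recovery site.
    """
    selected = [
        p for p in processes
        if p.bia_rank <= critical_rank
        and p.max_tolerable_days < days_to_permanent_site
    ]
    # Most urgent first: shortest tolerable outage, then highest BIA rank.
    return sorted(selected, key=lambda p: (p.max_tolerable_days, p.bia_rank))


# Hypothetical sample data for illustration only.
processes = [
    Process("Order entry", 1, 1, False),
    Process("Payroll", 2, 14, True),
    Process("Marketing analytics", 4, 60, True),
    Process("Customer support", 1, 3, True),
    Process("Regulatory reporting", 2, 45, True),
]

for p in interim_recovery_candidates(processes, days_to_permanent_site=30):
    workaround = "manual workaround possible" if p.manual_workaround else "technology required"
    print(f"{p.name}: tolerates {p.max_tolerable_days} days, {workaround}")
```

A list like this is a starting point for the conversations in Step 3, not a substitute for them. Business managers still decide what is critical, and the workaround flag only indicates where a manual process might reduce the technology needed at the interim site.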

Considerations

Again, not all processes can be recovered. This includes some activities with critical outcomes. Business continuity teams must therefore give management accurate information so that it can decide whether to accept or mitigate the resulting risk. According to BS 25999-1:2006 (Business continuity management code of practice, p. 21), managers should consider three things when assessing whether, and when, a process should be recovered:

  1. The maximum tolerable period of disruption of the critical process
  2. The costs of implementing a strategy or strategies for recovery or mitigation
  3. The consequence of inaction [defined in the BIA]
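To make the interplay of these three factors concrete, here is a minimal sketch of how a continuity team might pre-sort options before presenting them to management. It is an assumption-laden illustration, not something taken from BS 25999: the class, the field names, the labels, and the simple comparison of cost against consequence are all hypothetical, and both figures are assumed to be expressed in the same currency over the same period.

```python
from dataclasses import dataclass


@dataclass
class RecoveryOption:
    process: str
    mtpd_hours: float      # maximum tolerable period of disruption
    recovery_hours: float  # time the proposed strategy needs to restore the process
    strategy_cost: float   # cost of implementing the recovery or mitigation strategy
    inaction_cost: float   # consequence of inaction, as estimated in the BIA


def assess(option):
    """Return a coarse label for discussion; the decision stays with management."""
    if option.recovery_hours > option.mtpd_hours:
        # The strategy cannot restore the process within the tolerable period.
        return "rework strategy"
    if option.strategy_cost > option.inaction_cost:
        # Recovering costs more than the estimated loss from doing nothing.
        return "candidate for risk acceptance"
    return "candidate for recovery"


# Hypothetical figures for illustration only.
options = [
    RecoveryOption("Order entry", 24, 8, 150_000, 900_000),
    RecoveryOption("Archive retrieval", 240, 48, 80_000, 20_000),
    RecoveryOption("Payment processing", 2, 12, 500_000, 5_000_000),
]

for option in options:
    print(f"{option.process}: {assess(option)}")
```

The labels are deliberately tentative. Consequences such as reputational damage or regulatory exposure rarely reduce to a single number, so output like this is best treated as an agenda for the management discussion rather than as its conclusion.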

There are also logistical considerations when building a strategy. It cannot be built in isolation; what is and is not possible must be taken into account. A strategy built on unachievable assumptions results in incident response and recovery plans with little or no chance of success. Logistical considerations include:

  • Availability of key personnel.  If a recovery site is out of town, how will employees reach the site?  If a catastrophe encompasses a large geographic region, will employees even be available?
  • Premises.  Considering the list of critical processes, supporting technology, and manual workarounds, what are the office or data center requirements?  These include:
    • Space
    • Power
    • Cabling
    • Connection to the Internet
    • Direct connections to outside businesses/customers
    • Forms
    • Office equipment
  • IT infrastructure.  Entering into a contract for a warm or hot site requires deciding what infrastructure is needed.  The cost of the contract rises as infrastructure requirements grow.  When determining requirements, recovery teams must consider more than the equipment needed for normal operation.  They must also consider what equipment is needed at the outset if several critical systems have to be recovered concurrently.

There are additional considerations, but working through these provides answers about what type of recovery, if any at all, is feasible.

The final word

Whether a strategy is needed for smaller events (e.g., server failure, loss of key personnel) is up to management.  However, a strategy is necessary before planning for events resulting in loss of most or all data center capabilities.

Copyright © 2008 IDG Communications, Inc.
