Are you prepared for hurricane season? Disaster recovery and business continuity plan best practices

Despite the lull of a calm hurricane season so far, your enterprise’s disaster recovery/business continuity plan needs to be in place.

We are two months into the 2018 hurricane season, which runs from June 1 through November 30. Last year, Hurricane Harvey’s damage was estimated at $190 billion and Irma’s at $100 billion, according to AccuWeather’s economic cost estimates for the 2017 hurricane season. So far, this year is predicted to be much calmer, but let’s not be complacent and forget what we know we should do.

Whether you are a small, medium or Fortune 500 company, you need a disaster recovery/business continuity plan (DR/BCP) in place. Remember, DR/BCP is for more than just hurricanes: it also covers other natural disasters, terrorism, denial-of-service attacks and much more. NIST, HITRUST and HIPAA look at DR/BCP from an audit or compliance controls perspective as follows:

In NIST: NIST Special Publication 800-34, Rev. 1, Contingency Planning Guide for Federal Information Systems, provides instructions, recommendations and considerations for federal information system contingency planning. Contingency planning refers to interim measures to recover information system services after a disruption. Interim measures may include relocation of information systems and operations to an alternate site, recovery of information system functions using alternate equipment, or performance of information system functions using manual methods. This guide addresses specific contingency planning recommendations for three platform types and provides strategies and techniques common to all systems.

In HITRUST: Control Category 12 – business continuity management.

In HIPAA: HIPAA is a regulatory act, not a compliance framework. Its contingency plan standard, described below, has five implementation specifications.

164.308(a)(7)(i) contingency plan

The purpose of contingency planning is to establish strategies for recovering access to electronic protected health information (EPHI) should the organization experience an emergency or other occurrence, such as a power outage and/or disruption of critical business operations. The goal is to ensure that organizations have their EPHI available when it is needed. The contingency plan standard, as described on HHS.gov, requires that covered entities:

“Establish (and implement as needed) policies and procedures for responding to an emergency or other occurrence (for example, fire, vandalism, system failure, and natural disaster) that damages systems that contain electronic protected health information.”

The contingency plan standard includes five implementation specifications: (1) data backup plan (required); (2) disaster recovery plan (required); (3) emergency mode operation plan (required); (4) testing and revision procedures (addressable); (5) applications and data criticality analysis (addressable).

164.308(a)(7)(ii)(B) disaster recovery plan

The HHS.gov disaster recovery plan implementation specification requires covered entities to: “Establish (and implement as needed) procedures to restore any loss of data.” Some covered entities may already have a general disaster plan that meets this requirement; however, each entity must review the current plan to ensure that it allows them to recover EPHI.

So, we have three different compliance areas above, NIST, HITRUST and HIPAA, that all have much in common. They all require the ability to withstand business interruptions, planned or unplanned, and to resume business operations immediately with little or no outage or delay. This includes denial-of-service attacks, hurricanes, fires, floods, sabotage, cyberattacks and more. They all require emergency modes and backups as well.

Take the federal government’s National Institute of Standards and Technology standard above as an example. NIST 800-34, Contingency Planning Guide for Federal Information Systems, lays it out like this for the federal sector, and the steps translate to any business in the government or private sector:

  1. Develop the contingency planning policy statement. A formal policy provides the authority and guidance necessary to develop an effective contingency plan.
  2. Conduct the business impact analysis (BIA). The BIA helps identify and prioritize information systems and components critical to supporting the organization’s mission/business processes. A template for developing the BIA is provided to assist the user.
  3. Identify preventive controls. Measures taken to reduce the effects of system disruptions can increase system availability and reduce contingency life cycle costs.
  4. Create contingency strategies. Thorough recovery strategies ensure that the system may be recovered quickly and effectively following a disruption.
  5. Develop an information system contingency plan. The contingency plan should contain detailed guidance and procedures for restoring a damaged system unique to the system’s security impact level and recovery requirements.
  6. Ensure plan testing, training and exercises. Testing validates recovery capabilities, whereas training prepares recovery personnel for plan activation and exercising the plan identifies planning gaps; combined, the activities improve plan effectiveness and overall organization preparedness.
  7. Ensure plan maintenance. The plan should be a living document that is updated regularly to remain current with system enhancements and organizational changes.

Let’s look at the above statements in more detail:

  1. Develop the contingency planning policy statement. This is where we define what is in scope; it might be “provide resilient online business operations for the Federal Emergency Management Agency (FEMA).” If no immediate threat faces our nation, an outage might not be critical in, say, January; during the June 1 through November 30 hurricane season, however, the criticality is greatly elevated. Many other threats, such as sabotage, terrorism and denial-of-service attacks, are not tied to any season, and if you operate in tornado- or earthquake-prone areas you must consider those threats too. Remember that a vulnerability alone is not a risk; it must be exploitable by a real threat. For example, if you live in Florida, your home may be unsafe in the event of an earthquake, but no such threat is present in Florida, so the risk is low. The earthquake vulnerability becomes a risk only when it is tied to a specific threat against a specific target. This applies to any cyber threat as well: not all vulnerabilities are exploitable or apply to all targets.
  2. Conduct the business impact analysis (BIA). Determine the business processes and recovery criticality, identify outage impacts and estimated downtime, identify resource requirements, and identify system recovery priorities. This is where we must know our business risk appetite. Risk appetite can be defined as “the amount and type of risk that an organization is willing to take in order to meet their strategic objectives.” So, our risk management choices are: (1) avoid, (2) transfer, (3) mitigate, or (4) accept the risk.
  3. Identify preventive controls. Identify, implement and maintain controls.
  4. Create contingency strategies. Plan backup and recovery; consider FIPS 199 (Federal Information Processing Standards Publication 199, Standards for Security Categorization of Federal Information and Information Systems); identify roles and responsibilities; address alternate sites; identify equipment and cost considerations; and integrate the strategy into your system architecture.
  5. Develop an information system contingency plan. Simply document your recovery strategy.
  6. Ensure plan testing, training and exercises. Plan and carry out testing, training and exercise activities.
  7. Ensure plan maintenance. Review and update plan; coordinate with internal/external organizations; document changes; manage distribution of the plan.
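
Steps 1 and 2 can be sketched in a few lines of Python. This is only an illustration and is not taken from NIST 800-34: the system names, the 1-to-5 scales, the `likelihood * impact` score and the hour values are all assumptions made up for the example. It shows how a vulnerability with no credible threat behind it (the Florida earthquake case) lands at the bottom of the list, and how systems can be queued for recovery by how much downtime each can tolerate.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    """A threat paired with the asset it could hit (hypothetical 1-5 scales)."""
    threat: str
    asset: str
    likelihood: int  # 1 = no credible threat here, 5 = expected this year
    impact: int      # 1 = negligible, 5 = business-stopping

    @property
    def score(self) -> int:
        # Assumed scoring rule: a severe impact still scores low
        # when the threat itself is not credible for this target.
        return self.likelihood * self.impact


@dataclass
class System:
    """One row of a business impact analysis (illustrative fields only)."""
    name: str
    tolerable_downtime_hours: float


def rank_risks(risks):
    """Return risks with the highest score first."""
    return sorted(risks, key=lambda r: r.score, reverse=True)


def recovery_order(systems):
    """Return systems with the shortest tolerable downtime first."""
    return sorted(systems, key=lambda s: s.tolerable_downtime_hours)


# Made-up example data.
risks = [
    Risk("earthquake", "Florida data center", likelihood=1, impact=5),
    Risk("hurricane", "Florida data center", likelihood=4, impact=5),
    Risk("denial-of-service attack", "customer portal", likelihood=3, impact=3),
]
systems = [
    System("payroll", tolerable_downtime_hours=72),
    System("customer portal", tolerable_downtime_hours=4),
    System("email", tolerable_downtime_hours=24),
]

for risk in rank_risks(risks):
    print(f"{risk.score:>2}  {risk.threat} -> {risk.asset}")
print([s.name for s in recovery_order(systems)])
```

With this sample data the hurricane ranks first and the earthquake last, and the customer portal is restored before email and payroll. A real BIA would weigh many more factors, but the ordering logic is the same.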

Putting this all together with a sample vendor DR/BCP template: Sungard’s DR/BCP template is a useful reference. Sometimes you need a raw template to get started, and I have found this resource to be very effective. This template covers the plan purpose, plan objective, plan scope, plan scenarios, plan assumptions and much more. The plan section covers the recovery strategy, recovery tasks, recovery personnel, recovery timeline, critical vendors and recovery time objectives (RTOs), and critical equipment/resource requirements.

The template also includes an excellent flow chart that starts with the incident’s detection. Many vendors provide resources just like this example; use this one or search for more options online.
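
Once recovery tasks and a recovery timeline are written down, it is worth checking them against the recovery time objective (RTO) for each system. The sketch below is a hypothetical illustration, not part of Sungard’s template: the task names and durations are invented, and it assumes the tasks run one after another rather than in parallel.

```python
def recovery_time(task_hours):
    """Total hours needed if the recovery tasks run one after another."""
    return sum(task_hours.values())


def meets_rto(task_hours, rto_hours):
    """True when the sequential recovery time fits within the RTO."""
    return recovery_time(task_hours) <= rto_hours


# Hypothetical recovery tasks and durations (in hours) for one system.
portal_tasks = {
    "declare incident and notify team": 0.5,
    "fail over to alternate site": 1.5,
    "restore latest backup": 2.0,
    "verify and reopen service": 0.5,
}

print(recovery_time(portal_tasks))
print(meets_rto(portal_tasks, rto_hours=4))
```

Here the tasks add up to 4.5 hours against a 4-hour RTO, so the check fails. That is the kind of gap a plan test should surface before a real incident does: either the tasks need to be shortened or run in parallel, or the RTO needs to be renegotiated with the business.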

Conclusion

Risk is something we all face daily: our risk management choices are (1) avoid, (2) transfer, (3) mitigate or (4) accept the risk. But when a business has a reputation for delivering services to global or local customers, work stoppage is not an acceptable option. At the very least it must be controlled and managed. So, whether it be a hurricane, a cyberattack, a flood or a fire, risk needs to be managed. There are industry standards from NIST, HITRUST and HIPAA that will not only help you meet your compliance and audit requirements; they just may save lives and keep your business up in the most challenging of times.

This article is published as part of the IDG Contributor Network.