Five facets of a disaster recovery plan you can’t afford to neglect

In Backups and Recovery by Brad Temby

A disaster recovery plan is what an organization needs in order to triage and restore the assets on its network, such as systems, infrastructure, databases, and endpoints. Information disasters span a spectrum of four levels: Users, Systems, Infrastructure, and Fabric.

These four levels, together with Backups, make up the five facets that any disaster recovery plan should cover. If these facets are not considered, recovery from a disaster can become very complicated and expensive, not to mention damaging to your organization.

The Five Facets

All five facets need to be considered when developing an effective disaster recovery plan. When a disaster hits, recovery will involve every facet to some degree, whether it’s complete recovery, partial recovery, security policy adjustment, or internal company policy adjustment. The relationships between the facets have to be mapped out as part of the plan, and those relationships need to specify priorities and concrete objectives for the plan to succeed.

#1: End Users

In this scenario, the end user loses all their data and access to systems. The data can be stored locally on the computer, on a file server, or on a SaaS platform such as Office 365, OneDrive, SharePoint Online, or Google Drive. Typically this scenario only affects the user involved and rarely causes significant disruption to company functions. In a wider disaster recovery effort, restoring this facet is usually the endgame. It is also the facet that, if ignored, prevents users from accessing the restored resources and may render all of the work and resources spent on recovery useless.

#2: Systems

Here, a mission-critical system such as an ERP, an EMR or EHR in healthcare, or an eLearning platform has gone offline, taking access to critical organization data with it. Examples include a service that crashed, an application engine that crashed or hung, or a part of the network that went offline. Failure can have a moderate to severe impact on a business, depending on the scope of resources the system provides. In most cases, this facet is extremely critical to getting a business back on its feet.

#3: Infrastructure

This can be the most critical element of any disaster scenario. It can take many forms:

  • Catastrophic event (“smoking hole” scenario where the infrastructure has been completely wiped out physically – fire, flood, natural disaster, etc.)
  • External compromise – ransomware or other cyber-attack
  • Internal sabotage/failure

In these scenarios, even good backups and backup policies are not enough, because there is no platform to fall back on. In other words, you can have the absolute best backup and recovery system in the universe, but it won’t help you if there’s no target to restore to.

Usually, the next steps involve getting a cloud platform to host critical systems in the interim until the infrastructure can be fully restored to its former status or similar.

Infrastructure includes physical server equipment such as virtualization hosts, physical server chassis, storage appliances, clustered server resources, and the devices these items use to communicate with one another. This facet is not often obliterated in the real world but is nonetheless very important to the plan.

#4: Fabric

The very backbone of how devices interconnect to get the resources they need to fulfill daily operations, fabric includes network devices such as switches, routers, WLAN equipment and policies, firewalls, and internet circuits and connections. Fabric rarely gets touched, with the exception of the “smoking hole” scenario, sabotage/tampering, or serious network breach. Still, it should have some representation in the plan.

#5: Backups

This will include how servers, infrastructure, and systems are backed up, along with the backup strategy, repository locations, and the technology/platform being used. This facet is the most critical one in any plan, as any disaster without a plan for backup, backup storage, or retention could have apocalyptic effects on your organization.
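One way to make the relationships and priorities between the facets concrete is to write them down as a dependency graph and derive a recovery order from it. The sketch below is only an illustration: the dependency edges are assumptions chosen for the example (for instance, that systems cannot be restored until infrastructure and backups are reachable), not rules prescribed by this article, and your own plan may order things differently.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependencies: each facet maps to the facets that are
# assumed to need recovery before it. Adjust these to match your plan.
depends_on = {
    "fabric": set(),                       # network backbone comes first
    "infrastructure": {"fabric"},          # hosts and storage need the network
    "backups": {"fabric"},                 # repositories must be reachable
    "systems": {"infrastructure", "backups"},
    "end_users": {"systems"},              # the endgame of the recovery
}

# static_order() yields every facet after all of its prerequisites.
recovery_order = list(TopologicalSorter(depends_on).static_order())

for step, facet in enumerate(recovery_order, start=1):
    print(f"{step}. {facet}")
```

With these assumed edges, fabric always comes out first and end users last. The relative order of infrastructure and backups is left open, because nothing in the graph forces one before the other.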

Why do you need to plan for disaster recovery?

Outages are more expensive than you think, and the costs can range far beyond repairing or replacing the physical damage. These costs aren’t limited to the organization that suffered the attacks or breaches. Those affiliated with them are also affected, such as employees, students/patients, customers, suppliers, and other entities with a stake in the afflicted organization.

There are many instances where a breach, disaster, cyber-attack, or even a burglary cost organizations millions or even billions of dollars. A perfect example of this would be the cyber-attack on Maersk in 2017.

The Maersk Attack

In 2017, malware known as NotPetya, which presented itself as ransomware, infiltrated Maersk’s networks and crippled the company for weeks, even though its team fully recovered the infrastructure in 10 days. The attack not only affected computers but also compromised the company’s systems and rendered its networks useless. After all was said and done, the initial cost of the cyber-attack stood at about $300 million. However, this figure doesn’t include costs incurred by customers and suppliers. The attack shut down over 200,000 computers across 150 countries and thousands of applications and servers across 600 locations, and it disrupted operations at 76 ports worldwide and on almost 800 vessels. It also did unknown damage to Maersk’s reputation as the global leader in logistics.

Calculating the cost of a cyberattack

Cost of downtime

To assign a dollar value to what an outage would cost, you need to know your organization’s annual revenue. For non-profits, education, and government, most costs are intangible and sometimes impossible to estimate, but they can be approximated by the costs of staying open. This includes the cost of utilities, software licensing, payroll, travel, etc. Enterprises carry these costs as well.

A basic formula for a minimal estimate as to how an outage would affect revenue streams is as follows:

Total annual revenue / number of working days in a year

Financial cost example:

  1. A 24/7 shop earns about $100 million a year. Dividing by 365 days gives you approximately $273,973 per day.
  2. Take that figure and divide it by the number of working hours in the day. $273,973 divided by a 24-hour working day is about $11,416 per hour.
  3. Taking that figure and dividing it by 60 for every minute in the hour gives you the cost for every minute the business is unable to operate. $11,416 per hour equals about $190 per minute.

If compromised, attacked, or hit by a natural disaster, such a business could lose $190 of income EVERY MINUTE their systems are not operational.
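The arithmetic above can be sketched as a small function so you can plug in your own figures. This is a minimal, revenue-only estimate; the function and parameter names are made up for this example, and the defaults assume the 24/7, 365-day shop described above.

```python
def downtime_cost(annual_revenue, working_days=365, hours_per_day=24):
    """Revenue-only estimate of outage cost per day, hour, and minute."""
    per_day = annual_revenue / working_days
    per_hour = per_day / hours_per_day
    per_minute = per_hour / 60
    return {"per_day": per_day, "per_hour": per_hour, "per_minute": per_minute}


# The 24/7 shop from the example: $100 million a year.
costs = downtime_cost(100_000_000)
for period, value in costs.items():
    print(f"{period}: ${value:,.0f}")
```

A business that only operates on weekdays during office hours would pass its own values, for example `downtime_cost(revenue, working_days=260, hours_per_day=8)`, which raises the per-hour figure because the same revenue is earned in fewer hours.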

Keep in mind this formula doesn’t account for expenditures and everyday business costs such as software subscriptions, utilities, travel, etc. For organizations that don’t have a revenue stream based on profit, it’s more difficult to project these types of costs due to where the revenue comes from and how it’s acquired. In those cases, the costs could be way higher and possibly immeasurable.

The DR plan must take these figures into account to effectively categorize the risk factor for the organization. Of course, mitigating security risks and vulnerabilities carries a cost of its own. That work falls outside the natural scope of disaster recovery, but it still needs to be addressed.

Additional costs and hardships an organization may face include staff burnout and leaked organizational data, including data about payroll, customers, and social media accounts. One huge cost would be damage to the organization’s reputation, which can be harmed irreparably. If any data gets leaked or stolen, there’s a very good chance that it has been published and distributed on the dark web as well as the conventional web.

Cost of a disaster recovery service

Another cost to consider would be the services of an outside firm. For example, suppose a company gets hit with ransomware or any other catastrophe and has all of its computers and systems completely crippled. In that case, it may not have the resources needed on hand to mitigate it. Hence, they would reach out to a tech firm for assistance.

Typically, this would involve 1-5 consultants (or more) who make triage decisions and divide up the labor to get everything up and running ASAP. Hourly rates can run at $145/hr per consultant or more. Chances are, off-hours work would be necessary, at a higher hourly rate. Plus, the firm may bill you for travel and possibly lodging, depending on the situation.

Below is a breakdown of the costs if a complete restore takes 40 hours. Split that time between business hours and off-hours, multiply by the number of consultants and the rate for each period, then add travel and expenses.

Recovery cost example:

In this case, you have five consultants, each working 16 business hours and 24 off-hours, for 40 hours in total.

  1. Multiply five by 16 by the regular hourly rate: 5x16x145=$11,600.
  2. Now, consider the off-hours time assuming the price is $275/hr: 5x24x275=$33,000.

This comes out to a grand total of $44,600 of billable time plus expenses. However, depending on the firm and the nature of the task, the cost could get into the millions.
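The same calculation can be sketched in code to try other team sizes and rates. The function name is hypothetical, the default rates are the example rates quoted above rather than real quotes, and the call uses 16 business hours and 24 off-hours per consultant, which are the figures used in the multiplication above.

```python
def recovery_labor_cost(consultants, business_hours, off_hours,
                        regular_rate=145, off_hours_rate=275):
    """Billable consultant time for a recovery, excluding travel and expenses."""
    regular = consultants * business_hours * regular_rate
    premium = consultants * off_hours * off_hours_rate
    return regular, premium, regular + premium


# Five consultants, 16 business hours and 24 off-hours each.
regular, premium, total = recovery_labor_cost(5, 16, 24)
print(f"Business hours: ${regular:,}")
print(f"Off-hours:      ${premium:,}")
print(f"Total billable: ${total:,}")
```

Travel, lodging, and any other expenses the firm bills would be added on top of this total.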

Legal costs and reputation damage

You may also incur legal fees, for reasons ranging from recovering lost revenue to proactive litigation if it is discovered that data was leaked or stolen. There’s the additional time and cost of investigations, plus remediation if a security vulnerability is found to have led to the attack. Then there’s damage to your organization’s reputation and image, which is impossible to put a price on. If there is proof of any breach, law enforcement should be involved from the start.

Your reputation can be damaged in a number of ways. One is through leaked or stolen staff identities, which can be used to go after other organizations. For example, if any part of a C-level staff member’s identity were leaked or stolen, bad actors could use it to phish others inside or outside the organization. They do this to steal more information, to get back into the network, or for money, by tricking others into transferring funds into phony accounts.

Conclusion

At the end of the day, no system is 100% impenetrable. If one were, the organization that built it would find it had created obstacles to daily use that its users would consider unacceptable.

The only thing that we can do is to limit the damage and keep the cost of mitigation under control with a practical disaster recovery plan. It’s very important to realize that there are not only tangible costs but also intangible costs, as well as variable price tags or ongoing costs.

If you’d like help improving your security operations or want to know more about our disaster recovery services, then please get in touch.

About the Author
Brad Temby

Brad's experience spans the IT spectrum, including building and rebuilding critical network infrastructure, performance enhancements and upgrades, and infrastructure design. He has participated in and facilitated recovery from cyber-attacks and ransomware and has implemented measures to prevent such attacks.