[DISASTER RECOVERY] Disaster Preparedness: Creating a Plan

Before you can implement a disaster recovery strategy for your IT infrastructure, you've got to create an official plan. This critical document should detail every conceivable emergency that could reasonably befall your organization, pinpoint mission-critical applications and systems, and be signed off by all key figures in your organization—including executive management, human resources, and those responsible for facility management. This article will give you an outline for creating that plan.

Once you've met with key stakeholders and identified potential disaster scenarios, such as the loss of critical applications and data that could bring the organization to a standstill (and possible demise), your plan still has to be documented. It's important to have a concrete plan in a concise, written format to distribute to staff, so that no one is left in the dark when it comes to knowing what to do in the event of disaster. To guide you through creating the plan, here is a checklist of what an effective plan should contain.

The Checklist

• Identify mission-critical apps, systems, and platforms

You need to separate the essential from the expendable when identifying which components of an infrastructure absolutely must be available during a disaster. This spotlights the importance of an up-to-date inventory of hardware and software. Know every piece of software or hardware running in the infrastructure, including anything virtualized. It pays not only to invest in a good asset-management solution, but also to keep a log of all software and updates. This way you not only know what the entire IT inventory is in case of loss from a disaster, but you can also compile a list and check off which systems absolutely must remain operational during a crisis, and which you can live without temporarily.

Deliberate over what can be sacrificed in a disaster. For example, a database that's used to track sales leads may not be crucial in a disaster but, for a healthcare facility, a database listing all current patients is. Email may be needed to communicate status updates and procedures to staff, especially if employees are forced to remain off-site. Which components are important depends on the nature of the business, but, whatever they are, they should be listed and included in the plan.
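
One way to turn an inventory into the checklist described above is to tag each asset with a criticality tier and then filter on it. The sketch below is only an illustration: the asset names, the Tier labels, and the must_stay_up helper are all hypothetical and do not come from any particular asset-management product.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    """Hypothetical criticality tiers; adjust the labels to the business."""
    MUST_STAY_UP = 1   # must remain operational during a crisis
    RESTORE_SOON = 2   # can be down briefly
    CAN_WAIT = 3       # can be sacrificed temporarily


@dataclass(frozen=True)
class Asset:
    name: str
    kind: str          # e.g. "database", "email"
    virtualized: bool
    tier: Tier


def must_stay_up(inventory):
    """Return the names of assets that must remain available, sorted."""
    return sorted(a.name for a in inventory if a.tier is Tier.MUST_STAY_UP)


# Illustrative inventory for a healthcare facility (made-up data).
inventory = [
    Asset("patient-records-db", "database", True, Tier.MUST_STAY_UP),
    Asset("staff-email", "email", False, Tier.MUST_STAY_UP),
    Asset("sales-leads-db", "database", True, Tier.CAN_WAIT),
]

print(must_stay_up(inventory))  # ['patient-records-db', 'staff-email']
```

Keeping the tier next to each inventory entry means the list of must-have systems can be regenerated whenever the inventory changes, rather than maintained by hand.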

• Assessment and Implementation

This is where you need to start thinking about implementation. What data can be accessed off-site without compromising security or corporate compliance? If an organization has never shifted any business processes to a cloud-computing model, this may be a good time to consider doing so. Line-of-business applications may require more planning, or may be too complex to move to the cloud easily, but email and storage are good candidates for a move to the cloud.

Cloud-based Mail and Storage
Cloud-based email services are available that not only can mirror existing email systems, but can also comply with HIPAA and other email regulations where required. Many of these email providers can also implement data governance over email communications for a business such as a law firm, which may need to mark certain communications as confidential or highly sensitive, or may need to ensure that only certain staff members receive certain email communications.

Cloud storage is a fast-growing trend with consumers, and businesses can also take advantage of cloud storage as part of a disaster plan. Many organizations still have local backup solutions deployed, with data backed up to tape or RDX media. The backed-up data is often sent off-site and rotated regularly, so that a recent copy of an organization's data is readily available in case of system failures or a disaster.

However, having that data replicated to a cloud storage provider can save the time that would otherwise be spent retrieving it from a physical, off-site location and then manually restoring it to servers. With a cloud solution, critical data can be accessed in near real time, provided employees have Internet connectivity. There are also cloud storage providers that can help ensure stored data meets compliance requirements such as those of the Sarbanes-Oxley Act (SOX).
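
How much time a cloud restore really saves depends largely on how much data has to come back and how fast the Internet link is. The back-of-the-envelope sketch below estimates the download time for a full restore. The function name, the 80 percent link-efficiency figure, and the sample numbers are assumptions for illustration, not measurements from any provider.

```python
def transfer_hours(data_gb, link_mbps, efficiency=0.8):
    """Estimate hours needed to download data_gb over a link_mbps connection.

    Uses decimal units (1 GB = 8,000 megabits). The efficiency factor is an
    assumed allowance for protocol overhead and competing traffic.
    """
    if link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link speed must be positive and efficiency in (0, 1]")
    megabits = data_gb * 8000
    seconds = megabits / (link_mbps * efficiency)
    return seconds / 3600


# Hypothetical case: 500 GB of critical data over a 100 Mbps link.
print(round(transfer_hours(500, 100), 1))  # 13.9
```

If the estimate is longer than the business can tolerate, that is a signal to restore only the mission-critical subset first, or to ask the provider what bulk-recovery options it offers.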

Applications, Servers, and Virtualization
In outlining a disaster recovery plan, it pays to think not just about data moving to the cloud, but also any applications that could be moved. With providers such as Amazon, Rackspace, and Google, a business can transition applications and databases to the cloud so that access can be available in an emergency.

There are instances where a business can't completely back data up to the cloud, or can only implement a hybrid solution, with some data backed up to the cloud and other data remaining local. Reasons may include security concerns or prohibitive cost. Creating a disaster recovery plan is a good time to determine how an infrastructure can be streamlined.

In an emergency, the more disparate software is spread across separate pieces of hardware, the greater the potential for widespread damage and the longer it takes to restore systems. Virtualization can be a powerful solution for this kind of problem. Consolidating physical servers into virtual machines means IT can create regular snapshots of server instances and easily restore those servers after a disaster. With virtualization solutions offering features such as live migration, there needn't be a long period of downtime to restore critical infrastructure systems.

For organizations that still need to house most systems and data on-premises, a rolling mobile datacenter, stationed at a location determined to be safe in an emergency, can also be planned out. Backup servers that replicate data from a main site to a backup site can at least provide a way to keep critical systems available.

Power
Besides data and servers, there are more basic considerations to deal with in disaster recovery preparedness. One of the most common disaster scenarios is a power outage, and every business should plan for one, as the nation's electrical infrastructure has generally not kept pace with growth. All critical hardware should, of course, be running on uninterruptible power supplies (UPS). A UPS can keep equipment running after a power failure long enough for an organization to switch over to alternate disaster procedures. Regular checks and testing of UPS devices are critical.
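
To sanity-check whether a UPS buys enough time to switch over, you can do a rough runtime estimate from battery capacity and load. The sketch below is a simplification with assumed numbers: real runtime also depends on battery age, temperature, and discharge behavior, so the vendor's runtime charts and an actual load test should take precedence. The function names and the 90 percent inverter efficiency are hypothetical.

```python
def ups_runtime_minutes(battery_wh, load_watts, efficiency=0.9):
    """Rough runtime estimate: usable energy divided by load, in minutes."""
    if load_watts <= 0:
        raise ValueError("load must be greater than zero")
    return battery_wh * efficiency / load_watts * 60


def covers_switchover(battery_wh, load_watts, needed_minutes, efficiency=0.9):
    """True if the estimated runtime meets the time needed to switch over."""
    return ups_runtime_minutes(battery_wh, load_watts, efficiency) >= needed_minutes


# Hypothetical rack: 1,000 Wh of battery carrying a 600 W load.
print(round(ups_runtime_minutes(1000, 600)))                # 90
print(covers_switchover(1000, 600, needed_minutes=30))      # True
```

Recording the estimate next to the measured result of each UPS test makes it easier to spot a battery that is degrading.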

For longer power outages, some organizations may also need to work with the facilities department to establish alternate power sources such as generators dedicated to IT equipment.

Telecoms and Remote Access
Internet providers and mobile carriers often experience extended downtime in disasters. While there is not much a business can do in the event of a serious disaster that may affect telecommunications in the immediate and surrounding areas, it's worth having redundant Internet connections from different ISPs. That way, if one ISP's network is down, a second ISP may still be online. A good disaster recovery plan documents how the infrastructure will fail over from one Internet connection to a second, redundant connection. The plan should outline regular testing of that failover connection.
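
The failover logic the plan documents can be as simple as "use the first connection that passes a health check." The sketch below illustrates that idea under several assumptions: the link names and addresses are made up (they use reserved documentation ranges), and binding a probe to a source address only steers traffic over a given ISP if the host's routing is configured that way. In many networks the actual failover is handled by the router or firewall, not a script.

```python
import socket


def tcp_probe(source_ip, host="example.com", port=443, timeout=3.0):
    """Return True if a TCP connection to host:port opens from source_ip."""
    try:
        with socket.create_connection(
            (host, port), timeout=timeout, source_address=(source_ip, 0)
        ):
            return True
    except OSError:
        return False


def choose_link(links, probe):
    """Return the name of the first link (in priority order) whose probe passes.

    links is a list of (name, source_ip) pairs; probe is any callable that
    takes a source_ip and returns True or False. Returns None if all fail.
    """
    for name, source_ip in links:
        if probe(source_ip):
            return name
    return None


# Hypothetical links, listed in priority order.
LINKS = [("primary-isp", "198.51.100.10"), ("secondary-isp", "203.0.113.10")]

# Simulate a primary outage with a fake probe so the example runs offline.
primary_down = lambda ip: ip != "198.51.100.10"
print(choose_link(LINKS, primary_down))  # secondary-isp
```

Passing the probe in as a parameter lets the same selection logic be exercised in a scheduled failover test (with tcp_probe) and in a dry run (with a fake probe).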

The plan should also take into account how end users will access systems in an emergency. Many end users have company-issued or personal mobile devices that can be configured to remotely access the corporate network. Most organizations already have some sort of Virtual Private Network (VPN) solution in place, allowing remote entry into the business network. Does that VPN system actually work, and are non-technical employees adequately trained to use it without intensive IT support, which may not be available during a disaster? Can that VPN or remote access solution withstand a disaster? A backup to the VPN should also be considered. This may be a VPN server at another site, or access to data and systems through a cloud provider instead of the usual VPN system.

Some organizations may have a remote-access solution that will only grant a remote device access to the corporate network after scanning it against certain compliance requirements. For example, a Windows client device that lacks a necessary service pack or antivirus definition file may be denied access to the corporate network. You don't want surprises like this in an emergency. As part of disaster preparedness, detail which client devices will access the network in an emergency, and how. Regularly check those devices to ensure they can access the network. This is where an organization may want to consider a mobile device management (MDM) solution that allows centralized management of mobile devices accessing the corporate network.
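
A periodic pre-check of the devices named in the plan can surface these problems before an emergency does. The sketch below shows the general shape of such a check. The Device fields, the thresholds, and the sample data are hypothetical, and a real remote-access or MDM product would apply its own rules and collect this information itself.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumed policy values, for illustration only.
MIN_PATCH_LEVEL = 3
MAX_AV_AGE = timedelta(days=7)


@dataclass
class Device:
    owner: str
    os_patch_level: int
    av_definitions: date   # date of the installed antivirus definitions


def compliance_problems(device, today):
    """Return a list of reasons the device would be denied access."""
    problems = []
    if device.os_patch_level < MIN_PATCH_LEVEL:
        problems.append("missing service pack")
    if today - device.av_definitions > MAX_AV_AGE:
        problems.append("stale antivirus definitions")
    return problems


today = date(2024, 1, 15)  # fixed date so the example is repeatable
devices = [
    Device("alice", 3, date(2024, 1, 12)),
    Device("bob", 2, date(2023, 12, 1)),
]
for d in devices:
    print(d.owner, compliance_problems(d, today))
# alice []
# bob ['missing service pack', 'stale antivirus definitions']
```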

The company may have issued smartphones to employees. If the company uses one particular carrier for employee phones, consider having phones available from a different carrier to distribute to key employees in an emergency. If one carrier's network is down in a disaster, another's may be available. Don't rely on a single carrier for both Internet and telecommunications in a disaster.

• Document

After a disaster recovery plan has been documented, make sure that all key executives, management, and any other staff involved in disaster preparedness decision-making have reviewed and signed the document. Their sign-off makes the document official, and it should then be incorporated into the organization's policies.

The disaster recovery plan is a living document that should be regularly updated. If testing procedures are part of that document, the date and results of each test should be recorded and kept with the disaster recovery plan.
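
Test dates and results are easier to act on when they are kept in a structured log. The sketch below shows one possible shape for such a log, along with a helper that flags procedures whose most recent test failed or is overdue. The DrillRecord fields, the 90-day interval, and the sample entries are assumptions for illustration, not a prescribed schedule.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class DrillRecord:
    procedure: str   # e.g. "ISP failover"
    run_on: date
    passed: bool
    notes: str = ""


def needs_attention(records, today, max_age=timedelta(days=90)):
    """Return procedures whose latest test failed or is older than max_age."""
    latest = {}
    for r in records:
        if r.procedure not in latest or r.run_on > latest[r.procedure].run_on:
            latest[r.procedure] = r
    return sorted(
        name for name, r in latest.items()
        if not r.passed or today - r.run_on > max_age
    )


# Made-up log entries.
log = [
    DrillRecord("ISP failover", date(2024, 1, 10), True),
    DrillRecord("UPS runtime", date(2023, 6, 1), True),
    DrillRecord("VPN backup", date(2023, 12, 1), True),
    DrillRecord("VPN backup", date(2024, 2, 1), False, "backup server unreachable"),
]

print(needs_attention(log, today=date(2024, 3, 1)))
# ['UPS runtime', 'VPN backup']
```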

In the next part of this series, we'll take a look at executing disaster recovery plans and at solutions that can help you provision your disaster recovery strategy.