Virtual Machines for Disaster Recovery Planning

by John Y. Arrasjid

Virtual machines (VMs), due to their encapsulation and standardization of virtual hardware components, enable quick recovery on a diverse set of hardware platforms. This means quicker time to recovery and a significant drop in cost for implementation of a disaster recovery (DR) plan.

In increasingly complex and heterogeneous server environments, the time and cost of disaster recovery planning have climbed. Virtualizing servers and storage increases flexibility, fault tolerance, and ease of recovery in the event of a disaster. This article covers how server virtualization lowers costs, shortens recovery time, and reduces system-administrator stress.

The Inherent Flexibility of Virtual Machine Technology

There are currently several implementations of virtual machine technology. This article addresses the VM technology that virtualizes the underlying hardware and provides a virtual machine monitor (VMM) to handle allocating resources. For this type of VM technology, each VM sees exactly the same hardware. This is true even if the underlying hardware of the system running the VMs changes.

What does this mean for disaster recovery? It guarantees that, if the VM powers up in a different location (a DR cold/hot site) and the hardware underlying the virtual machine server changes, the VM will need no configuration or plug-and-play changes. The only potential changes will be to IP addressing and subnetting.
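To make the IP addressing change concrete, here is a minimal Python sketch that maps a VM's address from the primary subnet to the same host position in a recovery-site subnet. The function name and the example subnets are assumptions made for illustration. Nothing here is specific to any virtualization product, and your own renumbering scheme may differ.

```python
import ipaddress


def remap_to_dr_subnet(address: str, primary: str, dr: str) -> str:
    """Return the address at the same host offset inside the DR subnet.

    All three arguments are illustrative: `primary` and `dr` are CIDR
    strings for the main site and the recovery site.
    """
    primary_net = ipaddress.ip_network(primary)
    dr_net = ipaddress.ip_network(dr)
    host = ipaddress.ip_address(address)

    if host not in primary_net:
        raise ValueError(f"{address} is not inside {primary}")

    # Offset of the host from the start of the primary subnet.
    offset = int(host) - int(primary_net.network_address)
    if offset >= dr_net.num_addresses:
        raise ValueError(f"{dr} is too small to hold host offset {offset}")

    return str(dr_net.network_address + offset)


if __name__ == "__main__":
    print(remap_to_dr_subnet("10.1.2.25", "10.1.2.0/24", "192.168.50.0/24"))
```

Keeping the host offset the same at both sites is only one possible convention. It has the advantage that the mapping can be computed rather than looked up during a recovery.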

Time to Recovery

Two components affect the time to recovery (TTR). The first is the setup of a virtual infrastructure at the hot/cold recovery site. This includes setup of the virtual infrastructure servers, storage devices (local/NAS/SAN), and networking.

The second is recovery after a disaster. This includes loading the VM disk images on the storage devices and the configurations on the virtual infrastructure servers. Typical restore time for a 10GB system using virtual disks, after loading the images on the storage devices, is less than five minutes. This window includes registration of the system and powering on the VM.

For this second component, the time to place the data on storage at the recovery site depends on several factors:

  1. The amount of data to recover.
  2. The location of saved data to recover. (Obviously, replicating data to the hot/cold site before a disaster strikes makes for the quickest recoveries. Recovering data from tapes is slower.)
  3. Network configuration complexity, including subnetting, routing and VLAN configurations.
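As a rough illustration of how these factors combine, the following Python sketch estimates time to recovery from the amount of data and a sustained transfer rate. The function names, the throughput parameter, and the network setup allowance are assumptions for this example. The five-minute default is the upper bound quoted above for registering and powering on a VM, not a measured value.

```python
def transfer_minutes(data_gb: float, throughput_mb_per_s: float) -> float:
    """Minutes needed to copy data_gb gigabytes at a sustained rate."""
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    return data_gb * 1024 / throughput_mb_per_s / 60


def estimated_ttr_minutes(
    data_gb: float,
    throughput_mb_per_s: float,
    network_setup_min: float = 0.0,  # assumption: time for subnet/VLAN changes
    vm_restore_min: float = 5.0,     # upper bound for register + power on
) -> float:
    """Sum the data-placement, network, and VM restore components."""
    return (
        transfer_minutes(data_gb, throughput_mb_per_s)
        + network_setup_min
        + vm_restore_min
    )


if __name__ == "__main__":
    # Data already replicated to the recovery site: nothing left to copy.
    print(estimated_ttr_minutes(0, 100))
    # 10 GB restored at an assumed 100 MB/s, plus 15 minutes of network work.
    print(round(estimated_ttr_minutes(10, 100, network_setup_min=15), 1))
```

The estimate treats the components as sequential. If network changes can proceed while data is still being restored, the real figure would be lower.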

For all three items, the amount of time is the same as for physical machines. After recovering the data, the virtualization technology actually speeds the recovery. Here are the typical recovery steps after data recovery ends:

  1. Create VM configuration file.
  2. Register VM.
  3. Power on VM.
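The three steps above can be sketched as a small Python routine. This is not any vendor's API. The configuration keys are made up, and `register` and `power_on` are hypothetical hooks standing in for whatever command or interface your virtualization product provides for those steps.

```python
from pathlib import Path
from typing import Callable


def render_vm_config(settings: dict[str, str]) -> str:
    """Render settings as sorted key = "value" lines (illustrative format)."""
    lines = [f'{key} = "{value}"' for key, value in sorted(settings.items())]
    return "\n".join(lines) + "\n"


def recover_vm(
    config_path: Path,
    settings: dict[str, str],
    register: Callable[[Path], None],  # hypothetical hook: register the VM
    power_on: Callable[[Path], None],  # hypothetical hook: power on the VM
) -> list[str]:
    """Run the three recovery steps in order and report what was done."""
    config_path.write_text(render_vm_config(settings))  # 1. create config
    register(config_path)                               # 2. register VM
    power_on(config_path)                               # 3. power on VM
    return ["create config", "register", "power on"]


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        steps = recover_vm(
            Path(tmp) / "web01.cfg",
            {"display_name": "web01", "memory_mb": "512"},
            register=lambda cfg: print("would register", cfg.name),
            power_on=lambda cfg: print("would power on", cfg.name),
        )
        print(steps)
```

Passing the register and power-on actions in as functions keeps the sketch independent of any one product and makes the sequence easy to rehearse as a dry run before a real recovery.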

With the exception of potential IP addressing and VLAN changes, there are no VM changes. You don't need to change Linux's lilo.conf or Windows's boot.ini or registry settings when performing a DR process in a virtual infrastructure.

The quickest and costliest recovery method uses a snapshot or mirroring setup between your main data site and your recovery site, typically between two SAN systems. This is common at larger organizations with multiple data centers in different geographic locations, where each corporate data site snapshots or mirrors its data to the other.

The Cost Savings of In-House Site DR

Many hot/cold sites will charge a fee based on the level of service that you require.

Typically, recovery sites operate on a "first come, first served" basis. This can mean an unacceptable wait time, depending on your needs. When you weigh the level of service your DR plan needs against the budget required for outsourced DR site solutions, the savings from virtual machines can be substantial, even if you still use outsourced DR sites.

SLAs for a Virtual Machine DR Plan

There are several factors common to service-level agreements (SLAs). Consider the following in relation to your virtual infrastructure:

  1. VM snapshots give 100 percent recovery of the VM's state as of the snapshot file date.
  2. There's no plug-and-play time, even if the base physical hardware changes. The VM sees no changes in the virtual hardware layer.
  3. Powering on each virtual machine is typically faster than for physical hardware because of shared memory-page technology and virtualized disk technology.
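For the first item, "as of the snapshot file date" means that anything written after the last usable snapshot is not covered. A minimal Python sketch of that arithmetic follows. The function name and the timestamps are invented for illustration.

```python
from datetime import datetime, timedelta


def recovery_point(
    snapshots: list[datetime], disaster: datetime
) -> tuple[datetime, timedelta]:
    """Return the latest snapshot taken at or before the disaster,
    plus the window of changes that snapshot does not cover."""
    usable = [taken for taken in snapshots if taken <= disaster]
    if not usable:
        raise ValueError("no snapshot predates the disaster")
    latest = max(usable)
    return latest, disaster - latest


if __name__ == "__main__":
    nightly = [datetime(2024, 1, 1), datetime(2024, 1, 2)]
    point, exposure = recovery_point(nightly, datetime(2024, 1, 2, 6, 30))
    print("recover to:", point)
    print("changes not covered:", exposure)
```

Comparing that uncovered window against what the SLA tolerates is one way to decide how often snapshots need to be taken.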

Existing Virtual Machine Technologies

There are several good virtual machine implementations available. I work for VMware, so I'll present examples from my experience.
