I recently got a question from the backup architect about LeftHand snapshot techniques for backing up VMs. The current storage design classifies VMs per LUN based on their Recovery Point Objective (RPO), as shown in the overview below.
Within this storage design it could be a good solution to automate LeftHand Snapshots based on the LUN classification. For instance, we could create a schedule that takes a LeftHand Snapshot every 4 hours for volume “VMFS_R4_01” and every 12 hours for volume “VMFS_R12_01”.
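To illustrate the idea, here is a minimal Python sketch that derives a snapshot interval from the volume name. It assumes the naming convention implied by the two example volumes (VMFS_R<hours>_<index>); the function names are hypothetical and the sketch does not talk to SAN/iQ at all, it only works out which interval belongs to which volume.

```python
import re

# Assumed convention, inferred from the example names: VMFS_R<hours>_<index>
VOLUME_PATTERN = re.compile(r"^VMFS_R(\d+)_(\d+)$")


def snapshot_interval_hours(volume_name):
    """Return the RPO in hours encoded in the volume name, or None if it does not match."""
    match = VOLUME_PATTERN.match(volume_name)
    if match is None:
        return None
    return int(match.group(1))


def build_schedule(volume_names):
    """Map every classified volume to its snapshot interval in hours."""
    schedule = {}
    for name in volume_names:
        hours = snapshot_interval_hours(name)
        if hours is not None:  # skip volumes that carry no RPO class
            schedule[name] = hours
    return schedule
```

Calling `build_schedule(["VMFS_R4_01", "VMFS_R12_01"])` would give a 4 hour interval for the first volume and a 12 hour interval for the second; volumes without an RPO class in their name are simply left out.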
The most important question that arises is: “How consistent are the VMs when we Snapshot their VMFS Datastore from LeftHand?”
Well, currently no integration exists between the LeftHand Snapshot technique and vCenter. When a LeftHand Snapshot is started, vCenter isn’t alerted to quiesce the VMs, so the VMs continue processing while the LeftHand Snapshot is made, which leads to inconsistent VM states. Last year the LeftHand roadmap indicated that vCenter application integration would be available in the new SAN/iQ 8.5. SAN/iQ 8.5 currently ships with the HP/LeftHand P4000 G2 nodes and will be available for download on the 29th of March for existing P4000 users. For some reason, however, vCenter application integration has been pushed back to Q4 2010 or later.
So currently there’s only one way to get consistent VMs on the VMFS Datastore: creating a custom script that performs the following actions:
- Quiesce the VMs that reside on the VMFS Datastore that is going to be snapshotted (in practice this means “put the VMs in snapshot mode”);
- Create a LeftHand Snapshot of the VMFS LUN;
- Commit the VM snapshots to resume normal VM operation.
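The ordering of these three steps can be sketched in Python as follows. This is only an outline of the control flow: `quiesce_vm`, `snapshot_lun` and `commit_vm` are hypothetical callables that you would have to implement yourself against vCenter and SAN/iQ, and nothing here represents an actual VMware or LeftHand API.

```python
def snapshot_datastore(vm_names, quiesce_vm, snapshot_lun, commit_vm):
    """Quiesce the VMs, snapshot the LUN, then commit the VM snapshots.

    All three callables are hypothetical placeholders supplied by the caller.
    Returns the names of VMs whose snapshot could not be committed.
    If quiescing or the LUN snapshot fails, the exception is re-raised
    after the commit attempts have been made.
    """
    quiesced = []
    failed_commits = []
    try:
        # Step 1: put every VM on the Datastore in snapshot mode
        for vm in vm_names:
            quiesce_vm(vm)
            quiesced.append(vm)
        # Step 2: take the storage-level snapshot of the VMFS LUN
        snapshot_lun()
    finally:
        # Step 3: always commit what was quiesced, even after a failure,
        # so that no VM is left running on a growing delta file
        for vm in quiesced:
            try:
                commit_vm(vm)
            except Exception:
                failed_commits.append(vm)
    return failed_commits
```

The `finally` block is the important design choice: a VM that stays in snapshot mode keeps growing its delta file, so the commit step is attempted no matter what happened before it.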
The drawbacks of a custom script are obvious: it adds administrative overhead and is error prone. Please note that we chose not to use this method because of these drawbacks.
If you do decide to create custom scripts, please note the following consequences:
- If the VMFS Datastore contains many VMs, all the VMs on the Datastore have to enter snapshot mode, which could cause (depending on the workload of the VMs) explosive data growth on the Datastore.
- After the LeftHand Snapshot is made, all the VMs have to commit their snapshots, which could cause high SAN utilization depending on the size of the snapshot delta files.
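To get a feeling for the size of these effects, a rough upper bound can be calculated from the write rate of the VMs and the time they stay in snapshot mode. The sketch below is a back-of-the-envelope estimate only; the function name and all figures are illustrative assumptions, not measurements from our environment. It overestimates whenever a VM rewrites the same blocks, since those do not grow the delta file again.

```python
def estimate_delta_growth_gb(vm_write_rates_mb_s, hold_seconds):
    """Upper-bound estimate of total delta-file growth in GB.

    vm_write_rates_mb_s: average write rate per VM in MB/s (assumed values)
    hold_seconds: how long the VMs stay in snapshot mode
    """
    total_mb = sum(vm_write_rates_mb_s) * hold_seconds
    return total_mb / 1024.0


# Illustrative figures: 20 VMs writing 2 MB/s each, held for 5 minutes
growth_gb = estimate_delta_growth_gb([2.0] * 20, 300)
```

With these assumed figures the Datastore needs close to 12 GB of free space for the delta files, and the same amount of data has to be written back to the SAN when the snapshots are committed.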
Now let’s take a closer look at the consistent states that we can have/create:
- Crash consistent VM
- File-system consistent VM
- Application consistent VM
The table below lists the quiescing mechanisms depending on the VM guest OS, ESX version and the availability of VMware Tools.
|Guest Operating System|ESX Server 3.5 U1 or Earlier and Corresponding VMware Tools|ESX Server 3.5 U2 or Later and Corresponding VMware Tools|
|---|---|---|
|Windows 2000 Server 32‐bit, Windows XP 32‐bit|SYNC driver: File‐system consistent quiescing|SYNC driver: File‐system consistent quiescing|
|Windows Server 2003 32‐bit|SYNC driver: File‐system consistent quiescing|VMware VSS component: Application‐consistent quiescing|
|Windows Server 2003 64‐bit|Crash‐consistent quiescing|VMware VSS component: Application‐consistent quiescing|
|Windows Vista 32‐bit/64‐bit|Crash‐consistent quiescing|VMware VSS component: File‐system consistent quiescing|
|Windows Server 2008 32‐bit/64‐bit|Crash‐consistent quiescing|VMware VSS component: File‐system consistent quiescing|
|Other guest operating systems|Crash‐consistent quiescing|Crash‐consistent quiescing|
It’s important to note that currently only Windows Server 2003 is able to create application-consistent snapshots (obviously only for VSS-aware applications) and that Windows Server 2008 can only create file-system consistent snapshots.
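If you want to check the expected consistency level per VM in a script, the table above can be encoded as a simple lookup. The sketch below is a direct transcription of that table; the dictionary keys and the function name are my own choices, and it assumes the corresponding VMware Tools are installed, as the table does.

```python
CRASH = "crash-consistent"
FILE_SYSTEM = "file-system consistent"
APPLICATION = "application-consistent"

# Guest OS -> (ESX 3.5 U1 or earlier, ESX 3.5 U2 or later), taken from the table
QUIESCE_LEVELS = {
    "Windows 2000 Server 32-bit": (FILE_SYSTEM, FILE_SYSTEM),
    "Windows XP 32-bit": (FILE_SYSTEM, FILE_SYSTEM),
    "Windows Server 2003 32-bit": (FILE_SYSTEM, APPLICATION),
    "Windows Server 2003 64-bit": (CRASH, APPLICATION),
    "Windows Vista": (CRASH, FILE_SYSTEM),
    "Windows Server 2008": (CRASH, FILE_SYSTEM),
}


def quiesce_level(guest_os, esx_35_u2_or_later):
    """Return the consistency level for a guest OS; unknown guests are crash-consistent."""
    earlier, later = QUIESCE_LEVELS.get(guest_os, (CRASH, CRASH))
    return later if esx_35_u2_or_later else earlier
```

Any guest OS that is not listed falls through to crash-consistent, which matches the “Other guest operating systems” row.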
Again, because of the drawbacks we decided not to use the custom script option and to wait until the vCenter application integration is available. In the meantime other backup mechanisms are being validated specifically for our design, but that doesn’t mean a custom script couldn’t be a good solution in your specific case.