We'd like to start this blog by thanking EMC for our recent inclusion in their funded OpenBench Labs VMware backup report. We are flattered that they've targeted and included us in their marketing campaigns but, more important, we welcome the increased customer attention and discussion around better ways to solve VMware protection.
In fact, we invite any customer to compare for themselves Commvault's single solution versus EMC's collection of point products, and Symantec's too while they're at it. We are well versed in showing customers modern ways to back up and recover their data while reducing impact to production systems, automating management and reducing costs. Here is what we consider to be best practices in protecting virtualized environments and what the OpenBench Labs report missed (something that neither Gartner nor Forrester overlooked in their latest software backup and recovery reports).
Reading the OpenBench Labs report, you would think there is only one way to protect any virtual environment — with Avamar and Changed Block Tracking (CBT) backup & restore over an IP network.
We could not disagree more. As in physical environments, not all VMs are created equal, so they need different protection and recovery approaches to align with SLA needs. Beyond that, I would also like to address the scalability differences of a Simpana software solution and discuss how automation is now becoming a necessity for delivering IT as a service.
In summary, here is what we mean by a "Smarter Approach":
- Leverage integrated, array-based snapshots for critical VMs and applications
- LAN-free backup with source-side deduplication and LAN-free recovery
- "Incremental forever" backup that only moves changes
- Automated workflow for disaster recovery
Critical Applications Require Hardware Snapshots
According to the OpenBench Labs report, "No attempt was made to leverage any of the advanced hardware capabilities of the VNX array to bias backup performance." Backup products that lack equivalent snapshot integration (like Avamar) and use legacy "streaming" methods are limited to one or two backup copies per night. While this is sufficient for some small, low activity VMs, for most other VMs, recovering to last night's backup copy is as good as critical data loss.
For large or critical workloads, Commvault recommends using Simpana Virtual Server Agent (VSA) with IntelliSnap functionality that leverages the snapshot capabilities within the storage array to create application-consistent, VMware-consistent and persistent hardware snapshot copies. Commvault (and EMC) customers can use Simpana IntelliSnap to manage EMC SnapView Snapshots or SnapView Clones from a single console without the need for scripts.
Why do we believe snapshots are critical for VM protection? There are really two reasons.
The first is obvious: they can provide multiple recovery points per day. For critical applications, relying on a single backup copy from the previous night is not practical. As an example, rolling back to last night's email backup when several hours of transactions have occurred since then is an unacceptable RPO for most enterprises.
Simpana IntelliSnap recovery capabilities include:
- Granular file level recovery from hardware snapshots (since we catalog them)
- Full VM recovery, in-place or out-of-place
- Full volume reverts to quickly revert all VMs in case of datastore corruption
- Granular message-level recovery using an offline recovery tool as well as a message-level protection policy
The second reason for a snapshot-based approach for critical or large VMs is reliability. Relying on the VADP software snapshot process, which does not natively leverage hardware snapshots, can create "orphaned" software snapshots when the reconciliation of the redo log phase takes too long. By leveraging a hardware snapshot engine as Simpana IntelliSnap with VSA does, the redo log phase is minimal – reducing the potential for orphaned snapshots and ensuring that VMs are protected at scale.
Other vendors might claim that CBT-based backup and recovery provides the same benefits as IntelliSnap with VSA. Here are a couple of reasons why that is not the case:
- CBT backup, while effective for smaller, moderate transaction VMs, is not ideal for larger VMs. This is because CBT backs up large portions of the VMs when small block changes are scattered across the VMDK.
- CBT restores only work when the VM is in the original datastore. If it has moved because of Storage vMotion or DRS, you have to restore the full VM. When you perform this restore over the network, as Avamar does, meeting RTO can be a problem.
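To make the first point concrete, here is a minimal sketch of how a block-level change tracker inflates backup size when small writes are scattered. It is an illustration only: the 1 MB tracking granularity, the 4 KB write size and the `tracked_bytes` helper are assumptions chosen for the example, not values taken from VMware or any backup product.

```python
KB = 1024
MB = 1024 * KB

def tracked_bytes(write_offsets, write_size, granularity):
    """Bytes a block-level change tracker would flag for the given writes."""
    dirty = set()
    for off in write_offsets:
        first = off // granularity
        last = (off + write_size - 1) // granularity
        dirty.update(range(first, last + 1))  # every tracked block the write touches
    return len(dirty) * granularity

GRANULARITY = 1 * MB   # hypothetical tracking granularity
WRITE_SIZE = 4 * KB    # hypothetical size of each small write
N_WRITES = 1000

# Same amount of changed data, two different layouts on the virtual disk.
clustered = [i * WRITE_SIZE for i in range(N_WRITES)]  # back-to-back writes
scattered = [i * 10 * MB for i in range(N_WRITES)]     # one write every 10 MB

changed = N_WRITES * WRITE_SIZE
clustered_bytes = tracked_bytes(clustered, WRITE_SIZE, GRANULARITY)
scattered_bytes = tracked_bytes(scattered, WRITE_SIZE, GRANULARITY)

print(f"actually changed:  {changed / MB:.1f} MB")
print(f"clustered backup:  {clustered_bytes / MB:.0f} MB")
print(f"scattered backup:  {scattered_bytes / MB:.0f} MB")
```

Under these assumptions, the same few megabytes of changes cost a handful of tracked blocks when they sit together and roughly one tracked block per write when they are spread out. The real ratio depends on the actual tracking granularity and write pattern.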
Finally, Simpana software not only integrates EMC VNX snapshots, but it also supports storage arrays from NetApp, Dell, HP, IBM, HDS, Fujitsu and more. We continue to invest in expanding the IntelliSnap ecosystem and will have additional integrations to share later this year. The value to the customer is a single interface to normalize backup and recovery operations.
Best Practice: LAN-Free Backup with VADP
While IntelliSnap provides rapid RPO/RTO for the most critical VMs, the same Simpana VSA solution also helps provide efficient VADP-based protection for the VMs that do not have the same recovery needs.
VSA can be configured in two primary ways depending on customer need and infrastructure: LAN-free backup and Network backup. For the best performance and minimal impact to production, our recommended best practice is LAN-free backup, which is NOT how Simpana was tested by OpenBench Labs. To my knowledge, Avamar does not support the LAN-free approach described here.
The figure below depicts the recommended Simpana configuration when virtual machines are located on SAN-based datastores (Simpana also has the capability to perform LAN-free backup for datastores configured on NFS volumes).
The Simpana VSA and MediaAgent (the Simpana data mover) are both configured on a physical server running Windows Server with SAN access to the VMware datastores. During backup, the VSA reads the virtual machine disks (VMDKs) directly over the SAN, performs block-level deduplication and sends the data to the MediaAgent module to be written to the backup disk library.
This mode provides the fastest method of full or CBT-based incremental backups since data is not read over the LAN. In fact, data is never transferred over the IP network, except when the disk library itself is configured on a NAS device. More importantly, VM restores occur over the SAN as well, leading to much faster restore performance than when data is recovered over the IP network, as is the case with Avamar.
"Incremental Forever" with Deduplication
Because of the way Simpana software intelligently indexes data, each incremental backup represents a full system point in time. That means Simpana software customers can recover a full VM, or any individual file, from an incremental point in time in a single pass restore. There is no need to consolidate daily backups into a synthesized full backup, impacting production systems, as is the case with Avamar.
We complement this smarter approach with dedupe-aware synthetic full backups (aka DASH full). This was a key innovation in Simpana 9 because it means we don't have to rehydrate data. Simpana software simply updates the pointers on the deduplicated blocks that already exist on disk – a very fast operation that avoids impacting production systems (since the operation is performed by the MediaAgent). OpenBench and our competitors would have you believe this is a taxing process and one that needs to occur "every two weeks," which simply is not true. You can set your backup retention policy to run DASH full at whatever frequency you want in order to delete old data blocks that are no longer needed.
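The general idea of building a full backup from pointers rather than from data can be sketched in a few lines. This is a toy model, not Simpana's implementation: `DedupeStore`, `backup`, `synthetic_full` and `restore` are hypothetical names, and a real product tracks far more metadata than a block-index-to-hash map.

```python
import hashlib

class DedupeStore:
    """Toy content-addressed store: each unique block is kept once."""
    def __init__(self):
        self.blocks = {}  # digest -> block bytes

    def put(self, data):
        digest = hashlib.sha256(data).hexdigest()
        self.blocks.setdefault(digest, data)  # store only if not seen before
        return digest

    def get(self, digest):
        return self.blocks[digest]

def backup(store, blocks):
    """Back up {block_index: bytes}; returns {block_index: digest} pointers."""
    return {idx: store.put(data) for idx, data in blocks.items()}

def synthetic_full(full, *incrementals):
    """Merge pointer maps only; no block data is read or rewritten."""
    merged = dict(full)
    for inc in incrementals:
        merged.update(inc)  # newer pointers replace older ones
    return merged

def restore(store, pointers):
    """Rebuild the disk image from a pointer map in a single pass."""
    return b"".join(store.get(pointers[i]) for i in sorted(pointers))

store = DedupeStore()
full = backup(store, {0: b"AAAA", 1: b"BBBB", 2: b"CCCC"})
inc1 = backup(store, {1: b"bbbb"})  # only block 1 changed
inc2 = backup(store, {2: b"AAAA"})  # block 2 now matches an existing block

synth = synthetic_full(full, inc1, inc2)
print(restore(store, synth))
print(len(store.blocks), "unique blocks on disk")
```

In this sketch the synthetic full is just a dictionary merge, which is why it never needs to rehydrate block data. The cost of the equivalent operation in a real deduplication engine depends on its index design.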
For more discussion on this topic, read Phil Curran's post.
A well-configured Simpana MediaAgent (node) with the dedupe database hosted on SSD drives (4x SSD, SAS or SATA in RAID 5) can support clients totaling 60 TB of data and can hold up to 120 TB of unique (back-end) data. Each node can provide up to 2 TB/hr average backup throughput, although peak throughput is much higher. This rate depends, of course, on how fast data can be read from the production systems as well as how many clients are backing up concurrently.
Moreover, when you combine two Simpana nodes into a partitioned dedupe grid, the resulting dedupe store can support 100-120 TB of client data and up to 200-240 TB of back-end unique data.
In contrast, according to Avamar documentation, a single Avamar node can only support up to 7.8 TB of back-end data. In other words, you need a fully stacked rack of Avamar nodes just to manage the same amount of data that can be managed by a single Simpana dedupe node.
In addition, a single Simpana VSA node can protect 30-40 TB of VM data. This enables you to protect very large environments at scale. For example, using 4 servers with the specified hardware configuration, we can provide end-to-end protection for up to 120 TB of VM data. This includes the ability to perform IntelliSnap operations if necessary.
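The node counts implied by these figures are simple to check. The sketch below uses only the capacities quoted in this post and a hypothetical `nodes_needed` helper. It ignores redundancy, spare nodes, dedupe ratios and growth, so treat it as back-of-the-envelope arithmetic rather than a sizing guide.

```python
import math

# Capacities quoted in this post (TB).
SIMPANA_NODE_BACKEND_TB = 120   # unique back-end data per Simpana dedupe node
AVAMAR_NODE_BACKEND_TB = 7.8    # back-end data per Avamar node
VSA_NODE_VM_TB_LOW = 30         # VM data per VSA node, low end
VSA_NODE_VM_TB_HIGH = 40        # VM data per VSA node, high end

def nodes_needed(total_tb, per_node_tb):
    """Smallest whole number of nodes whose combined capacity covers total_tb."""
    return math.ceil(total_tb / per_node_tb)

# Avamar nodes needed to hold what one Simpana dedupe node holds.
avamar_nodes = nodes_needed(SIMPANA_NODE_BACKEND_TB, AVAMAR_NODE_BACKEND_TB)

# VSA nodes needed for 120 TB of VM data at each end of the 30-40 TB range.
vsa_nodes_low = nodes_needed(120, VSA_NODE_VM_TB_LOW)
vsa_nodes_high = nodes_needed(120, VSA_NODE_VM_TB_HIGH)

print("Avamar nodes for 120 TB back-end:", avamar_nodes)
print("VSA nodes for 120 TB of VM data:", vsa_nodes_high, "to", vsa_nodes_low)
```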
That scale cannot be matched using Avamar. For large-scale VM protection, EMC's recommendation is to add a large Data Domain appliance and have Avamar back up to Data Domain. This ends up complicating the solution considerably and makes management and meeting SLAs more challenging.
Here are some comparisons:
| | Simpana | Avamar |
|---|---|---|
| Maximum Back-End Capacity of a dedupe node | 1 node with commodity storage | |
| Maximum Front-End Capacity per Node | 60 TB, 50 concurrent streams | |
| Data Managed by a single VSA Proxy Server | 30-40 TB front-end, including IntelliSnap | Unknown, does not include IntelliSnap |
| Total number of servers/nodes required to protect 100-120 TB of VM data | 4 (with appropriate capacity storage) | Usually requires Data Domain at this scale |
Tying it Together with Workflow Automation
While it is difficult to compare in a lab report, the final piece is the new, built-in workflow feature in Simpana 10 that lets customers automate processes such as DR and DR testing. Our customers often leverage virtual infrastructure for DR operations, so the ability to automate snapshots, replication, mounting the snap at the DR site, restoring the changes and scripting the validation of the application copy using a drag-and-drop workflow is a valuable part of this story. (Phil Curran also blogged about workflow automation.)
Of course, all of this is possible since we've already built these modern data protection features into a single software platform. Trying to lead R&D teams to make this all work from 4-5 different acquired products is not something I would want to manage or support. So I'm grateful I don't have to inflate R&D budgets in order to create value for customers and partners. I am also grateful for the opportunity that EMC has provided for us to tell our side of the story.