No, backups aren’t dying.
But in general, you really should be complementing your backups with snapshots and replicas. Here’s why:
Backups make a copy of the data and transform it (i.e., optimize/deduplicate/compress it) so that numerous versions of data can be efficiently stored for a long period. Unfortunately, when one turns “steak” into “hamburger” in that manner to optimize storage and retention, one must also reverse that meat grinder and restore the data into a steak again. Simply put, there’s going to be a tradeoff between optimum retention and recovery agility.
Snapshots typically reside on the same storage system as production data at a block-granular level. The good news is that there is no faster way than a snapshot to “get back” a previous version. The bad news is that anything affecting the storage array impacts not only the production copy, but also the snapshots. In addition, snapshots (because they live on expensive primary storage) tend to only retain data for days, not weeks, months, or years.
Replicas offer near-immediate recovery agility similar to snapshots because they do not have to transform the data; they mirror it for rapid access. Replicas also aren’t vulnerable to the single point of failure that snapshots are: by definition, they are copies held on other platforms, similar to backups. But like snapshots, most replicas are focused on near-real time and thus don’t provide a long range of previous versions.
Backups, snapshots, and replicas (along with other approaches) comprise what ESG refers to as the Data Protection Spectrum.
The ESG Data Protection Spectrum. Source: Enterprise Strategy Group, 2016.
You never see a rainbow with just three or four colors. Similarly, your data protection strategy should include all of the colors of the data protection spectrum unless some business justification truly prevents it.
But, there is a danger.
Using multiple mechanisms for data protection—i.e., backups, plus snapshots, plus replicas … or multiple niche products for backing up VMs, or databases, or SaaS platforms, or “everything else”—can lead to:
Overprotection. Overprotection occurs when multiple tools are protecting the same data set. Think of a vAdmin protecting a VM using a VM-backup tool, even as a database administrator is protecting the database itself. Each stakeholder is protecting the data without others being aware.
Underprotection. Underprotection occurs when various stakeholders assume one of the others is protecting the data, but really, no one is. The situation may remain undiscovered until something bad happens, and data is unrecoverable.
For many organizations, a solution can come in the form of what ESG refers to as The 5 Cs of Data Protection. The concept recognizes that you probably have multiple kinds of “Containers” (disk/tape/cloud) and “Conduits” (backup/snapshotting/replication). But over time, those two Cs may become commodities. More important are the remaining three Cs:
The Control plane, as a policy engine, to ensure that everything is adequately protected.
The Catalogue to show you what you have and where all of the copies are.
The Console(s) to help you understand what you have, what is going on with it, and how well the protection processes are functioning. The Console enables you to interact with the Control layer and the Catalogue, which in turn makes it easier to oversee the various Containers and Conduits.
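To make the Catalogue and Control plane roles concrete, here is a minimal sketch of the bookkeeping they perform: a catalogue that records which tools protect which assets, and a control-plane audit that flags both failure modes described earlier. All names here are hypothetical illustrations, not any vendor’s API.

```python
from collections import defaultdict

class Catalogue:
    """Hypothetical catalogue: maps each data asset to the tools protecting it."""

    def __init__(self):
        self._protections = defaultdict(set)

    def register(self, asset, tool):
        self._protections[asset].add(tool)

    def protections(self, asset):
        return self._protections.get(asset, set())

def audit(catalogue, assets):
    """Control-plane check: flag assets protected by multiple tools
    (overprotection) or by none at all (underprotection)."""
    over = [a for a in assets if len(catalogue.protections(a)) > 1]
    under = [a for a in assets if len(catalogue.protections(a)) == 0]
    return over, under

cat = Catalogue()
cat.register("sql-vm-01", "vm-backup-tool")  # the vAdmin protects the whole VM...
cat.register("sql-vm-01", "db-agent")        # ...while the DBA protects the database
# "file-server-02" was never registered: everyone assumed someone else had it.

over, under = audit(cat, ["sql-vm-01", "file-server-02"])
# over  -> ["sql-vm-01"]      (two tools, unaware of each other)
# under -> ["file-server-02"] (no one is protecting it)
```

The point of the sketch is the single source of truth: only because every stakeholder registers against one shared catalogue can the control plane see overlaps and gaps that each tool, in isolation, cannot.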
Commvault centralizes backups, snapshots, and replicas—as well as the disk, tape, and cloud repositories—under a single control plane governed by a unified catalogue and a unified console. It’s one of the many things I like about the Commvault platform … aside from the fact that it also starts with “C.”
Jason Buffington (@JBuff) is the Principal Analyst at ESG focusing on all forms of data protection, preservation, and availability. He has actively deployed or consulted on data protection and storage solutions for 27 years, working at channel partners, various data protection software vendors and Microsoft.