Legacy Migration Overview
Legacy backup data migrations are typically perceived as high-cost, slow to complete, and error-prone due to the manual effort involved. For these reasons and many others, customers have chosen to avoid migrating their data, instead waiting for it to expire (essentially aging off the legacy backup system) and moving forward with all new data managed through Commvault® solutions. However, there are instances where keeping the data on a legacy system is not a viable option, and migration to the Commvault Data Platform becomes a necessity. These instances typically arise from the following circumstances:
- Extended retention policies where customers do not want to maintain both systems for financial, risk, or labor overhead reasons
- The customer wants to take advantage of the Commvault Data Platform for both new and legacy data
- Avoiding the costs of extending licensing, support, and maintenance contracts for legacy infrastructure environments
To address the data migration barriers of risk, cost, time, and significant manual effort, Commvault has developed a solution to move data from legacy data protection software to the Commvault Data Platform. Commvault software interacts with the legacy platform to restore data to a temporary storage location, ingests that data by performing a backup, and then, as the final crucial component, automatically marks the data with additional metadata that preserves the original backup time, retention schedule, and other key attributes of the legacy data. This whitepaper describes the following in more detail:
- Industry trends requiring a cost-effective data migration solution
- An overview of the Commvault Workflow Engine
- A technical overview of the Commvault data migration process, using a Veritas NetBackup to Commvault use case
Unforeseen Vendor Lock-in with the Transition to Disk Backup and the Increased Need for a Cost-Effective Data Migration Solution
A shift in the backup and recovery industry over the last few years has been the transition away from tape as a storage medium toward disk-based targets using deduplication technologies, especially for long-term storage requirements. Why does that matter when switching from legacy data protection software? When tape was the primary medium for long-term data storage, only a minimal footprint of the legacy backup system was needed to perform restores, which kept costs low. Essentially, all that was required was a server with the legacy software installed and one or more tape drives to maintain access to all your data while servicing the occasional restore.
With increased use of disk-based systems for long-term retention, you can no longer keep a "small instance" running to access all the data. Unlike a small tape installation, the entire disk-based system must remain operational at all times for any of the data to be accessible. Keeping an entire disk-based system running generally incurs substantial maintenance costs, especially in year 5 and beyond. Customers can feel "locked in" to the disk-based system as maintenance costs escalate, leaving them feeling their hands are tied and often taking a "do nothing" approach, staying with and accepting a sub-par solution.
The Commvault Data Migration offering is focused on substantially reducing labor cost and risk through automating the steps required to migrate the legacy data.
Cloud on Your Terms: Avoid Vendor Lock-in and Take Control of Your Data
The good news: just because you want or need to change your cloud infrastructure, you don't have to put your company's data (or your job) at risk.
Overview of Commvault® Workflow Engine
The Commvault Workflow Engine enables rapid, reliable and repeatable automation of data protection operations. Workflow operations, including integration of 3rd party APIs, can be scheduled and executed from a command line or driven through a web process. This engine offers considerable flexibility for tailoring the end-user experience to the specific requirements of the business. Example workflow projects include:
- Billing system integration
- Application and Portal integration through C++/CLI/XML/REST/JAVA/HTML
- Object Link Deployment through REST
For the purposes of migration, Commvault has integrated with 3rd party legacy backup software to orchestrate restore of data, XML creation for metatags, and ingestion into the Commvault virtual repository.
Technical Deep Dive – Commvault® Data Migration through a Veritas NetBackup Use Case
The following sections describe how Commvault migrates data from legacy software, using Veritas NetBackup as the example. The technical migration is separated into three distinct phases:
- Discovery – Gathering data from the legacy environment as inputs to the migration by executing legacy software vendor specific commands
- Data Export – Executing legacy software commands to restore data from the legacy environment to a target storage location (typically referred to as a 'landing zone')
- Data Ingestion – Creating custom XML for metadata and backing up the landing zone data into Commvault
Phase 1 - Discovery
Commvault needs to gather several key pieces of information in order to properly scope, estimate, and plan the migration effort from the legacy backup environment. To gather the necessary information, Commvault will either request that scripts or a series of commands be run on the source legacy backup software.
These scripts and commands will obtain the following information:
- Number of backup jobs
- Number of clients / servers / VMs
- Client names
- Client retention policy – 30-day, monthly, yearly, etc.
- Client original protection date
- Clients' backup sizes
- Backup types – full, incremental, etc.
- Backup agents utilized for protection – file, system, database, etc.
- Backup sources – tape, disk, etc.
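Once these values are collected, they can be rolled up into a scoping summary. A minimal sketch in Python (the record fields below are assumptions for illustration, not the actual output format of any discovery script):

```python
from collections import Counter

def summarize_inventory(jobs):
    """Aggregate discovered backup jobs into a scoping summary.

    Each job is a dict with hypothetical keys: 'client', 'size_gb',
    'backup_type', and 'retention' -- placeholders for whatever the
    discovery commands actually return in a given environment.
    """
    return {
        "job_count": len(jobs),
        "client_count": len({j["client"] for j in jobs}),
        "total_size_gb": sum(j["size_gb"] for j in jobs),
        "by_retention": dict(Counter(j["retention"] for j in jobs)),
        "by_type": dict(Counter(j["backup_type"] for j in jobs)),
    }

# Illustrative sample of discovered jobs
jobs = [
    {"client": "srv01", "size_gb": 120, "backup_type": "full", "retention": "monthly"},
    {"client": "srv01", "size_gb": 8, "backup_type": "incremental", "retention": "30-day"},
    {"client": "srv02", "size_gb": 300, "backup_type": "full", "retention": "yearly"},
]
summary = summarize_inventory(jobs)
```

A summary of this shape is enough to drive the scoping conversation below: job and client counts, total data in scope, and the retention mix.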
Depending on the legacy backup software vendor, we may gather additional relevant data. In the Veritas NetBackup example (Figure 1), Commvault gathers additional data pertaining to the DOMAIN construct, which will be required to perform a restore.
Figure 1. Example Discovery Commands – NetBackup
Command Line Interface (CLI) commands to be executed on the NetBackup Master Server
• bpretlevel -U
• bpimagelist.exe -d "01/01/1970 00:00:00"
• bpflist.exe -backupid XYZ-### -M XYZ-Master -ut XYZ- -client XYZ-Client -policy XYZ-Policy -option GET_ALL_FILES -rl 999 > XYZ-Workspace\files_list.txt
Note: Any word reference that includes the prefix "XYZ-" should be replaced with the environment-specific name.
After the data is collected from the source legacy backup environment, Commvault's Professional Services team will be engaged to discuss migration options and requirements, for example:
- Does all of the legacy data need to be migrated, or just a defined subset (for example, monthly full or yearly full data only)?
- Estimated temporary storage requirements
- Defining in scope as well as out of scope data sets
- Schedule pre-migration testing – the goal of this phase is to perform a migration of a small sample of data before conducting the full-scale production migration effort in order to:
- Gather and review performance metrics from the legacy backup software environment
- Use those metrics to predict the production migration timeline
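The timeline prediction is essentially throughput arithmetic over the metrics harvested during pre-migration testing. A hedged sketch (the 0.7 efficiency derating is an assumed placeholder to be tuned from actual test results):

```python
def estimate_migration_hours(total_tb, restore_mbps_per_stream, streams, efficiency=0.7):
    """Rough migration-duration estimate from pre-migration test metrics.

    total_tb: data in scope (TB)
    restore_mbps_per_stream: measured legacy restore throughput per stream (MB/s)
    streams: concurrent restore streams from the legacy environment
    efficiency: derating factor for scheduling gaps and retries
                (an assumption -- calibrate it from the sample migration)
    """
    total_mb = total_tb * 1024 * 1024
    effective_mbps = restore_mbps_per_stream * streams * efficiency
    return total_mb / effective_mbps / 3600

# e.g. 50 TB at a measured 80 MB/s per stream over 4 concurrent streams
hours = estimate_migration_hours(50, 80, 4)
```

Running the same sample migration at different stream counts shows quickly whether the legacy restore side or the landing zone is the bottleneck.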
Phases 2 & 3 - Data Export and Ingest
We purposely grouped the discussion of the second and third phases together to show the entire process through a series of execution steps. Below (Figure 2) is a diagram depicting the high-level process in seven steps. For consistency, we will describe the steps and then show a NetBackup example.
Figure 2. Migration Workflow
Migration Execution Steps
- Commvault Workflow communicates with the legacy backup software to initiate and orchestrate data movement to the landing zone. To initiate a migration, select the candidate jobs from the CVMigrationStatus report, then click the 'Start Migration' button on the Dashboard. The report includes the following fields:
- ClientName – virtual machine / server for which Commvault will orchestrate the data restore
- MasterServer – NetBackup Master Server name
- Policy – NBU Backup Policy that the job belongs to
- BackupID – Unique backup id for NBU job
- BackupType – Full, Inc, Diff
- BackupTime – When the backup occurred
- DataSize – Size of Backup
- RetentionLabel – Name of the retention category
- Status – NetBackup job status
- FailureReason – NetBackup job failure reason
- CVJobID – Corresponding Job ID of CV Job
- CVJobstatus – Status of the migration from NBU Job to CV Job
- MigratedDataSize – Size of the data CV protected
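The report fields above can be modeled as a simple record type. A sketch (the types, defaults, and the 'Pending' status value are assumptions for illustration, not Commvault's actual schema):

```python
from dataclasses import dataclass

@dataclass
class MigrationStatus:
    """One row of the CVMigrationStatus report (types are assumptions)."""
    client_name: str = ""
    master_server: str = ""
    policy: str = ""
    backup_id: str = ""
    backup_type: str = ""       # Full, Inc, Diff
    backup_time: str = ""
    data_size: int = 0
    retention_label: str = ""
    status: str = ""            # NetBackup job status
    failure_reason: str = ""
    cv_job_id: int = 0
    cv_job_status: str = ""
    migrated_data_size: int = 0

def candidates(rows):
    """Jobs not yet migrated -- a guess at how candidates are filtered."""
    return [r for r in rows if r.cv_job_status == "Pending"]

rows = [
    MigrationStatus(client_name="srv01", backup_id="b1", cv_job_status="Pending"),
    MigrationStatus(client_name="srv02", backup_id="b2", cv_job_status="Completed"),
]
pending = candidates(rows)
```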
- Commvault Workflow executes legacy backup software commands to restore data from the legacy environment to the landing zone. The size of the landing zone is generally substantially less than the total amount of data to be restored (an exception is a situation where there is a small number of jobs to be migrated and they can all be accommodated within a single landing zone platform). Commvault will determine the landing zone size during the pre-migration testing phase.
Factors that affect the size of the landing zone include:
- Largest and average backup job size
- Total amount of concurrent streams from legacy backup software
- Average restore throughput from legacy backup software
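These factors combine into a straightforward capacity estimate. A sketch under the assumption that restored jobs are deleted after ingestion, so the landing zone only ever holds the in-flight working set (the 20% headroom factor is an assumption):

```python
def landing_zone_gb(largest_job_gb, avg_job_gb, concurrent_streams, headroom=1.2):
    """Estimate landing-zone capacity in GB.

    The zone must hold either the largest single job or one average job
    per concurrent restore stream -- whichever is bigger -- because
    restored jobs are deleted after ingestion rather than accumulating.
    The headroom multiplier is an assumed safety margin.
    """
    working_set = max(largest_job_gb, avg_job_gb * concurrent_streams)
    return working_set * headroom

# largest job dominates here: max(500, 100 * 4) * 1.2
size = landing_zone_gb(500, 100, 4)
```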
Figure 3. Legacy Backup Software Restore Commands – NetBackup Restore Example
bprestore.exe -C XYZ-Client -D XYZ-StagingClient -S XYZ-NBUServer -f XYZ-Workspace\filelist_to_restore.txt -R XYZ-WorkSpaceDir\file_rename_list.txt -print_jobid -s "XYZ-STARTDATE" -e "XYZ-ENDDATE" -p XYZ-Policy -w
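For orchestration, the Figure 3 command can be assembled programmatically. A sketch that mirrors the documented flags (the example values are placeholders; in practice the argv list would be handed to subprocess.run on the staging host):

```python
def build_bprestore_cmd(client, staging_client, master, file_list, rename_list,
                        start, end, policy):
    """Assemble the NetBackup restore command shown in Figure 3.

    Returned as an argv list rather than a shell string so that paths
    containing spaces need no extra quoting. All arguments here are
    environment-specific placeholders.
    """
    return ["bprestore.exe",
            "-C", client,           # source client to restore from
            "-D", staging_client,   # destination (landing zone) client
            "-S", master,           # NetBackup Master Server
            "-f", file_list,        # list of files to restore
            "-R", rename_list,      # rename/redirect list
            "-print_jobid",
            "-s", start, "-e", end, # backup window to restore from
            "-p", policy,
            "-w"]                   # wait for completion

cmd = build_bprestore_cmd("srv01", "staging01", "nbu-master",
                          r"D:\ws\filelist_to_restore.txt",
                          r"D:\ws\file_rename_list.txt",
                          "01/01/2015 00:00:00", "12/31/2015 23:59:59",
                          "PolicyA")
```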
- Each restore job from the legacy backup software is written to the landing zone with a specific folder structure reflecting client name, backup date, and backup time. An example folder would be represented as: CLIENTNAME_BACKUPDATE_BACKUPTIME
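The naming convention can be sketched as a pair of helper functions (the exact date and time formats are assumptions; the paper only fixes the CLIENT_DATE_TIME ordering):

```python
from datetime import datetime

def landing_folder(client, backed_up_at):
    """Format the landing-zone folder name CLIENTNAME_BACKUPDATE_BACKUPTIME.

    YYYYMMDD / HHMMSS are assumed formats chosen so names sort
    chronologically per client.
    """
    return f"{client}_{backed_up_at:%Y%m%d}_{backed_up_at:%H%M%S}"

def parse_landing_folder(name):
    """Recover client and timestamp; rsplit tolerates '_' in client names."""
    client, date_s, time_s = name.rsplit("_", 2)
    return client, datetime.strptime(date_s + time_s, "%Y%m%d%H%M%S")

name = landing_folder("srv01", datetime(2016, 3, 1, 22, 30, 0))
```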
- Commvault Migration Workflow generates metadata for each restore job and creates an XML file to be imported into Commvault software. Associating custom metadata with each job allows Commvault to maintain the original backup date, data expiration, and other properties from the legacy backup software. Without custom metadata, imported backup jobs would show the current date as their protection time, making searches for historical data extremely labor intensive. Commvault also uses the metadata to set data expiration automatically at the per-job level, which would otherwise be manual and labor intensive, and in some legacy backup software environments impossible. Manual handling also introduces a level of risk, which effective automation avoids.
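The per-job metadata XML can be sketched with the standard library (the element names here are illustrative only; the real Commvault migration XML schema is not reproduced in this paper):

```python
import xml.etree.ElementTree as ET

def migration_metadata_xml(backup_id, client, backup_time, retention_label, expires):
    """Build a per-job metadata document for ingestion.

    Captures the properties the migration must preserve: original backup
    time, retention category, and expiration. All element and attribute
    names are hypothetical.
    """
    job = ET.Element("MigratedJob", attrib={"backupId": backup_id})
    ET.SubElement(job, "ClientName").text = client
    ET.SubElement(job, "OriginalBackupTime").text = backup_time
    ET.SubElement(job, "RetentionLabel").text = retention_label
    ET.SubElement(job, "ExpirationDate").text = expires
    return ET.tostring(job, encoding="unicode")

xml_doc = migration_metadata_xml("b123", "srv01", "2016-03-01T22:30:00",
                                 "monthly", "2023-03-01")
```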
- Data will be imported into Commvault software with the ObjectStore client, which can be set up to run at regular intervals throughout the day. Once new data has been detected, Commvault software will read the XML file and begin the ingestion process. After the data has been imported, Commvault will delete the data in the landing zone, removing the need for any post-migration scripting or manual clean-up jobs after each import operation.
- Data is indexed asynchronously and custom metadata is processed while under Commvault protection.
- Once fully indexed, data is searchable via Commvault's Global Smart Index and accessible through Commvault's Web Console. Example search queries include:
- When was the original date of backup?
- Which data expires in 2 years?
- Where is client A's data from 3 years ago?
- Are there any Excel spreadsheets from client A that are 4 years old?
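A query such as "which data expires in 2 years?" reduces to a date filter over the ingested per-job metadata. A sketch (the record shape is an assumption; `today` is pinned only to keep the example deterministic):

```python
from datetime import date, timedelta

def expiring_within(records, years, today=date(2016, 6, 1)):
    """Return backup IDs of jobs whose preserved expiration date falls
    within the next `years` years.

    Records are dicts with hypothetical 'backup_id' and 'expires' keys,
    standing in for the metadata Commvault indexes per migrated job.
    """
    cutoff = today + timedelta(days=365 * years)
    return [r["backup_id"] for r in records if r["expires"] <= cutoff]

records = [
    {"backup_id": "b1", "expires": date(2017, 1, 1)},
    {"backup_id": "b2", "expires": date(2026, 1, 1)},
]
soon = expiring_within(records, 2)
```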
Figure 4. Commvault Migration XML – NetBackup Example
Figure 5: Search Capability
Historically, organizations have made software and hardware platform choices based on a number of factors, typically summarized as technical capability, availability, integration capability, enterprise strategy, and price. These decisions are made with the best of intentions, based on the data and information available at the time. However, times change, and with them so do technology and capability; reflecting on the ever-changing needs of the business, along with shifting market conditions and industry trends, we find ourselves needing to adapt and evolve. Sometimes that evolution is hampered and made complex by having to move and migrate data and workloads from one platform to another.
Migrations from legacy backup platforms to Commvault can now be made simpler, faster, and more cost-effective, with less risk, by leveraging Commvault's automated migration solution.
Organizations will no longer have to retain, maintain, and manage a reduced footprint of their legacy software platforms purely for long-term retention restores, nor will they have to wait for their data to expire; they can plan to move to Commvault today and begin benefiting immediately from Commvault's industry-leading enterprise data platform.
Organizations no longer have to feel "locked in" or committed to a legacy backup vendor; Commvault can help make the switch planned, predictable, and low in deployment risk.