New Data Protection For Amazon DynamoDB

By Lorne Oickle

Cloud has been a hot topic for a while now, with discussions about the elasticity of compute services, the race toward free cloud storage, and hosted applications that spare you from purchasing and managing software. Businesses small and large have been migrating to the cloud to save costs and to ease their reliance on upfront capital expenditures for hardware and software.

Cloud Platform as a Service (PaaS) has been gaining traction in these discussions. Cloud vendors like Amazon, Microsoft and Google offer extensive PaaS services that allow users to focus on developing and using the services instead of tweaking and patching software. The ability to scale as your application changes is one of the primary reasons to move to a managed solution. One area that a lot of users initially neglect, though, is determining best practices for protecting their data. Typically, this is handled internally by the application developer or by a separate backup team. However, due to the nature of PaaS services, the underlying infrastructure is not accessible in the same way as in an on-premises solution. What’s needed in these situations is a data protection solution that uses the cloud-native systems in the way they were designed to protect the data.

Commvault has the broadest and most in-depth data protection solution for Cloud PaaS databases in the industry. As of the latest 11.19 Feature Release, Commvault provides backup and recovery support for Amazon DynamoDB.

First off, for those who aren’t aware of DynamoDB and what it is, I’ll run through a few quick points. It’s a fully managed NoSQL database service offered by Amazon and, since it’s PaaS, it’s completely serverless from the perspective of the database user. There is no software to install and there are no patches to schedule, which is not just a huge time saver but also greatly reduces the security and vulnerability risks associated with unpatched software. We are seeing customers who use Cassandra and MongoDB migrate to DynamoDB for these reasons.

One of the most common questions I am asked is, “Why do I need a third-party data protection solution?” Amazon provides native protection of DynamoDB for a duration between 0 and 35 days through the use of snapshots. Most companies have data compliance policies they need to uphold, and these typically require much longer than 35 days. Numerous companies are required to store data for as long as seven years, or even forever! The challenge with the snapshot solution is the automation of retention, deletion and replication. All of this needs to be managed by the user through a manual process, a limited native solution, or complicated scripting.
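To give a feel for the retention bookkeeping a user would otherwise script by hand, here is a minimal sketch. The function name `expired_backups` and the sample backup ages are my own illustrations, not part of any Amazon or Commvault API; the sketch only compares a 35-day window with a seven-year one.

```python
from datetime import datetime, timedelta

def expired_backups(backup_times, retention_days, now):
    """Return the backup timestamps that fall outside the retention window."""
    cutoff = now - timedelta(days=retention_days)
    return [t for t in backup_times if t < cutoff]

# Hypothetical backups taken 1, 30, 40, 400 and 3000 days ago.
now = datetime(2020, 6, 1)
backups = [now - timedelta(days=age) for age in (1, 30, 40, 400, 3000)]

# Under a 35-day window, everything older than 35 days is gone.
outside_native_window = expired_backups(backups, 35, now)

# Under a seven-year policy (approximated as 7 * 365 days), far less expires.
outside_seven_years = expired_backups(backups, 7 * 365, now)

print(len(outside_native_window), len(outside_seven_years))
```

With these sample ages, three of the five backups fall outside a 35-day window, while only the oldest falls outside a seven-year policy.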

The Commvault solution gives users the ability to perform a streaming, granular backup of individual tables, a complete region, or multiple regions. In other words, the choice is yours to define your backups based on your requirements. For example, assign an aggressive backup policy with multiple backups per day to your production data and a more relaxed once-a-day backup policy to Dev/Test data. Backups can also be organized based on Tags and Rules if that’s your choice.
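The idea of mapping tags to backup policies can be sketched in a few lines. This is an illustration of the concept only, not Commvault's rule engine: the `BackupPolicy` class, the `env` tag key, and the frequencies below are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BackupPolicy:
    name: str
    backups_per_day: int

# Hypothetical policies: aggressive for production, relaxed for Dev/Test.
PRODUCTION = BackupPolicy("production", 4)
DEV_TEST = BackupPolicy("dev-test", 1)

# Hypothetical rules: each pairs the tags a table must carry with a policy.
RULES = [
    ({"env": "prod"}, PRODUCTION),
    ({"env": "dev"}, DEV_TEST),
    ({"env": "test"}, DEV_TEST),
]

def policy_for(table_tags, rules=RULES, default=DEV_TEST):
    """Return the policy of the first rule whose tags all match the table's tags."""
    for required, policy in rules:
        if all(table_tags.get(key) == value for key, value in required.items()):
            return policy
    return default

print(policy_for({"env": "prod", "team": "billing"}).name)
```

Rules are evaluated in order and the first match wins, so more specific rules would need to be listed before more general ones.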

Typically, we recommend that primary backups be stored in the same cloud region in which the source databases are running. The secondary and tertiary copies can be stored in a different cloud region, another cloud, or even back in an on-premises data center. This means backups are stored where you want and for as long as you want. Most companies have compliance policies that require backup copies to be stored in at least two physically different locations. However, due to the nature of cloud outages, a copy of the data usually must also be stored with an alternative cloud vendor. Native cloud backup solutions only provide the ability to store data in the same cloud.
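A placement requirement like this is easy to express as a check. The sketch below is a hypothetical illustration: the `BackupCopy` type, the provider labels, and the `meets_placement_policy` function are names I made up, and the two-location threshold simply mirrors the compliance example above.

```python
from typing import NamedTuple

class BackupCopy(NamedTuple):
    label: str      # e.g. "primary", "secondary"
    provider: str   # e.g. "aws", "azure", "on-prem"
    location: str   # region or data center name

def meets_placement_policy(copies, min_locations=2, require_second_provider=False):
    """Check that copies span enough distinct locations and, optionally, vendors."""
    locations = {(c.provider, c.location) for c in copies}
    providers = {c.provider for c in copies}
    if len(locations) < min_locations:
        return False
    if require_second_provider and len(providers) < 2:
        return False
    return True

# Hypothetical layout: primary next to the source, secondary in another region.
copies = [
    BackupCopy("primary", "aws", "us-east-1"),
    BackupCopy("secondary", "aws", "us-west-2"),
]

print(meets_placement_policy(copies))
print(meets_placement_policy(copies, require_second_provider=True))
```

The first check passes because the copies sit in two regions; the second fails until a copy is added with another vendor or on premises.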

What good are backups without restores? From a restore perspective, you have the option to restore individual tables, multiple tables, or all tables in a region. Out-of-place restores are even possible for those who want to restore to another cloud account or a different region.

As I mentioned, the native Amazon backup solution relies on snapshots, which means a snapshot of the database is taken and stored. The primary issue with snapshots is that they’re only useful for restoring to the same database and cloud, and nothing else. This means you’re paying for the storage of these snapshots just in case you might need them.

The Commvault solution for DynamoDB is seen in the diagram above. A user defines protection parameters for specific databases or all databases and sets a retention policy for the backup data. The backup data is stored in the location of choice. We do this through the use of native Amazon APIs. The DynamoDB tables are exported during the backup process and stored in S3 storage. Once Commvault has access to the stored data, we can create multiple copies in any location configured in your environment. The reverse is performed during a restore. The user chooses the point in time for the restore, the data is restored from the backup copy of choice, and the data is imported to an existing or new table using the native APIs. The flexibility in this design offers numerous options to be Cloud Ready.
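The "choose a point in time" step of a restore can be illustrated with a small sketch. It assumes that the backup to use is the most recent one taken at or before the requested time; the function name `pick_restore_point` and the sample timestamps are hypothetical and do not describe Commvault's internals.

```python
from datetime import datetime

def pick_restore_point(backup_times, requested):
    """Return the most recent backup taken at or before the requested time, or None."""
    candidates = [t for t in backup_times if t <= requested]
    return max(candidates) if candidates else None

# Hypothetical backup history for one table.
history = [
    datetime(2020, 5, 1, 0, 0),
    datetime(2020, 5, 1, 12, 0),
    datetime(2020, 5, 2, 0, 0),
]

# A request for the evening of May 1 resolves to the noon backup.
chosen = pick_restore_point(history, datetime(2020, 5, 1, 18, 30))
print(chosen)
```

A request that predates every backup returns `None`, which a caller would need to report rather than silently restore newer data.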
