Why Data Governance requires automation

BY Max Mortillaro


Organizations, whether Internet behemoths or more traditional enterprises, have collected Personally Identifiable Information (PII) for years, if not decades, and for many different reasons. Many did so on the premise that having as much information as possible can only help the business.

Today, however, regulations such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA) restrict indiscriminate, large-scale data collection about individuals. Regulators hold organizations accountable for data collection and data privacy and can impose hefty sanctions if regulations are breached.

These regulations protect consumers but have an impact on organizations, transforming data ownership into a potential tsunami of liabilities. 

How can organizations successfully implement a data governance strategy with a minimal impact on operations?

The unbearable ubiquity of data

Data is ubiquitous and present in many forms: emails, databases, spreadsheets and all kinds of files. It may reside on central systems as well as in user directories, SharePoint libraries, corporate laptops or in the unprotected S3 bucket of the organization’s newest cloud-based flagship project.

Ensuring adherence to regulations cannot be limited to specific applications or data platforms. It must be an organization-wide exercise that encompasses all of the IT systems and data sources, regardless of their location. Data governance is needed to achieve compliance.

Defining Data Governance policies

Data Governance is a joint effort between all parts of the organization to understand what kind of data is available, what should and should not be collected, and to classify collected data according to several criteria.

Relevant questions include:

  • what data is captured?
  • what business purpose exists for collecting it?
  • how relevant or critical is the captured data?
  • where is the data stored?
  • how long can it be retained?
  • are there legal requirements applying to certain data types?
  • is PII present?
  • can data be pruned upon a legal request (right to privacy) without affecting application data structures?
  • conversely, is there a requirement for long-term archival and/or an override of the right to privacy?
  • is data adequately protected?
  • when does it have to be deleted?
  • how can it be deleted securely?

These are only a few examples. The goal of these inquiries is to define data management policies that can apply across the organization.
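To make the idea of a policy more concrete, the answers to the questions above can be captured in a machine-readable record. The following is a minimal sketch in Python, not a reference to any product: the DataPolicy class, its field names and the example values are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass(frozen=True)
class DataPolicy:
    """Hypothetical policy record; field names are illustrative only."""
    data_type: str                 # what data is captured?
    business_purpose: str          # why is it collected?
    storage_location: str          # where is it stored?
    contains_pii: bool             # is PII present?
    retention_days: Optional[int]  # None means no fixed retention limit
    legal_hold: bool = False       # long-term archival overrides deletion

    def deletion_due(self, collected_on: date, today: date) -> bool:
        """Return True when a record has outlived its retention period."""
        if self.legal_hold or self.retention_days is None:
            return False
        return today > collected_on + timedelta(days=self.retention_days)


# Example policy with made-up values.
newsletter = DataPolicy(
    data_type="newsletter subscriber email",
    business_purpose="marketing communication",
    storage_location="crm-database",
    contains_pii=True,
    retention_days=365,
)
```

Calling newsletter.deletion_due(collected_on, today) then answers the "when does it have to be deleted?" question for a single record, and the legal_hold flag models the archival override mentioned in the list.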

Data Governance requires automation

Automation plays a crucial role in successfully implementing a data governance strategy. Organizations cannot afford to hire large teams of individuals dedicated to reviewing data manually. The cost would be prohibitive, the outcomes limited, and the amount of work required overwhelmingly arduous, if not outright impossible.

Automated data analysis across infrastructure platforms, applications, file shares and unstructured data is the only rational approach to the challenges posed by Data Governance and applicable regulations. 
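As a rough illustration of what automated analysis means in practice, the sketch below scans a set of text documents for two kinds of PII. It is a toy example under stated assumptions: the patterns are deliberately simplistic, the function names find_pii and scan_documents are invented here, and a real platform would rely on far more robust detection.

```python
import re
from typing import Dict, List

# Deliberately simple, illustrative patterns; real PII detection needs far more care.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def find_pii(text: str) -> Dict[str, List[str]]:
    """Return the matches found for each PII category in a piece of text."""
    hits: Dict[str, List[str]] = {}
    for label, pattern in PII_PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            hits[label] = matches
    return hits


def scan_documents(documents: Dict[str, str]) -> Dict[str, Dict[str, List[str]]]:
    """Scan named documents and keep only those that contain PII."""
    report = {}
    for name, text in documents.items():
        hits = find_pii(text)
        if hits:
            report[name] = hits
    return report
```

Passing a mapping of document names to their text into scan_documents returns a report that lists only the documents containing matches, which is the kind of inventory a manual review team could never produce at scale.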

This approach involves defining clear data management policies. These policies are relevant for the analysis phase; they can also be used to implement workflows and trigger automated handling, minimizing human involvement in high-volume activities.
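One way to picture how a policy can trigger automated handling is a small rule table that maps analysis findings to actions. The sketch below assumes a hypothetical Finding record produced by the analysis phase and hypothetical action names ("delete", "alert"); it only plans the actions and does not execute anything.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Finding:
    """One result of the analysis phase (hypothetical structure)."""
    path: str
    contains_pii: bool
    age_days: int


@dataclass
class Rule:
    """A policy rule: a condition plus the name of the action it triggers."""
    name: str
    condition: Callable[[Finding], bool]
    action: str  # hypothetical action names such as "alert" or "delete"


# Rules are evaluated in order, so the most specific rule comes first.
RULES: List[Rule] = [
    Rule("stale-pii", lambda f: f.contains_pii and f.age_days > 365, "delete"),
    Rule("any-pii", lambda f: f.contains_pii, "alert"),
]


def plan_actions(findings: List[Finding], rules: List[Rule]) -> List[Tuple[str, str]]:
    """Pair each finding with the action of the first rule that matches it."""
    plan = []
    for finding in findings:
        for rule in rules:
            if rule.condition(finding):
                plan.append((finding.path, rule.action))
                break  # first matching rule wins
    return plan
```

Keeping the rules as data rather than hard-coded logic means business units can review and adjust them without touching the code that applies them, and people only need to step in for findings that no rule covers.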

Commvault Activate: Data Governance done right

Data Governance done right implies not only well-defined policies and collaboration between business units, but also reliance on a holistic data management platform that can bind different imperatives together.

Commvault Activate™ is Commvault’s response to some of the challenges outlined above.

Activate is an enterprise-class data management solution covering the following disciplines:

  • Data Discovery
  • Data Analytics/Reporting
  • Data Protection
  • Storage Management
  • Advanced Security Capabilities

From a management perspective, organizations can define policies within Activate to send alerts, create reports and even start taking action on identified policy breaches with automated activities.

Activate can also be used to handle storage aspects such as backup snapshots, to identify obsolete data, and to define which data should be retained and which is eligible for removal.

When it comes to data retention, organizations can use Activate to define which storage tier is best suited to the data, for example on-premises or cloud storage. On the security side, Activate can be used to limit access to critical data assets and implement granular management capabilities.

Usually, data management platforms require a lot of fine-tuning, but Activate comes out of the box with workflows and pre-built solution accelerators to speed deployment.


A solid data governance strategy should embrace legal aspects of data management, such as GDPR/CCPA, and be flexible enough for any future legislation changes. It should also be backed by a data management platform that empowers the organization with the ability to make choices and take decisions in an automated fashion.

Platforms such as Commvault Activate help organizations:

  • Automate data discovery and categorization activities
  • Identify/manage compliance-related content
  • Identify potential risks and proactively mitigate them
  • Prune obsolete/irrelevant data before migrating data sets across tiers/clouds
  • Go beyond files and extend compliance enforcement across emails and unstructured data

Thanks to Commvault Activate, organizations can understand how data is used across the business, act to comply with legislation, and make informed decisions about data governance.