The Dark Side of Data—and What to Do About It

By Don Foster

In my last blog post, I discussed no workload left behind; today, I’m going to map out how to find dark data and what to do with it.

Unstructured data makes up 80% or more of enterprise data and is growing at a rate of 55% to 65% per year, according to Datamation. According to TechJury, 95% of businesses cite the need to manage unstructured data as a problem for their organization. The interesting, and challenging, thing about unstructured data going into a data lake is knowing what’s out there, the integrity of the data and the risks it poses. Not knowing leads to higher costs, higher risk and the potential for a security breach. It’s a critical issue that not enough organizations have looked into, primarily because customers face these questions and may not have the answers:

  • Do you know where all of your data is being stored?
  • What’s in it?
  • Is it properly secured with an appropriate level of risk for access?

Anyone who says yes to these questions is kidding themselves, or doesn’t understand the reality of what’s occurred within IT with digital transformation, shadow IT and IT everywhere. Dark data is a challenge, but it’s a journey where you’re constantly going to adapt with the business as it adjusts to risk and security profiles and threat landscapes. To find and manage dark data, you need unmatched indexing of all workloads to know your data, understand the risks and take action—and you need Intelligent Data Services to enable that.

Risk profiling is a key part of managing dark data. Once customers find out where the information is stored, they can decide how best to address it with an effective risk profile, which includes knowing where the information is, whether it’s structured or unstructured, and the appropriate execution plan to better handle data in the environment. This shines a light on dark data.
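
To make the idea of a risk profile a little more concrete, here is a minimal sketch in Python. It is only an illustration, not how any particular product works: the RiskProfile fields, the example locations and the rules in execution_plan are all assumptions made up for this post, and a real policy would be driven by your own risk and compliance requirements.

```python
from dataclasses import dataclass


@dataclass
class RiskProfile:
    """Hypothetical record for one data source; field names are illustrative."""
    location: str             # where the information is stored
    structured: bool          # True for structured data, False for unstructured
    has_sensitive_data: bool  # assumed flag, e.g. the result of a content scan
    access_restricted: bool   # assumed flag: is access appropriately limited?


def execution_plan(profile: RiskProfile) -> str:
    """Pick a next step using made-up rules; real policies would differ."""
    if profile.has_sensitive_data and not profile.access_restricted:
        return "restrict access"
    if not profile.structured and not profile.has_sensitive_data:
        return "review for cleanup"
    return "keep and monitor"


# Example inventory; the locations are invented for illustration.
profiles = [
    RiskProfile("//fileserver/hr-share", False, True, False),
    RiskProfile("//fileserver/old-projects", False, False, True),
    RiskProfile("orders-db", True, True, True),
]

# Map each location to its suggested next step.
plans = {p.location: execution_plan(p) for p in profiles}
for location, plan in plans.items():
    print(f"{location}: {plan}")
```

The point of the sketch is simply that each of the three things a risk profile needs to capture (location, structure and a plan of action) can be recorded per data source and revisited as the business changes.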

Creating a risk profile is the first step toward effective data management. Think about it this way—you clean out the junk in your closet before you move the good stuff to a new location instead of just loading it all in a U-Haul and taking it with you. The same thinking applies to data; you don’t want to spend time or money managing junk when good data is such a valuable asset. According to TechJury, poor data quality costs the US economy up to $3.1 trillion every year.

And here’s where we look back at my first blog post, where I talk about being data ready. Many types of risk come with failing to manage data effectively, whether that data is personal information, private company data, intellectual property, or open-domain, easily stolen data that is no longer maintained for government and compliance requirements.

One of the principles of being data ready is getting ahead of dark data. Good risk management is forward-looking and future-proof; this means organizations need to be able to review and update their risk profiles on the fly. This can only be accomplished with effective, intelligent data management strategies already in place. Data management has a ripple effect across all data.

Commvault’s data insights capabilities, including Commvault File Storage Optimization, Commvault Enterprise Data protection and Commvault Sensitive Data Governance, can go a long way towards solving the dark data problem.

Stay tuned for my next post, which will cover managing prescriptively across the data lifecycle.