Beware the Blind Spots: Gaining a Clear Line of Sight with Data Lineage

Feb 21st, 2023

Gary Allemann – MD at Master Data Management

We all have blind spots – in our personal lives, our jobs, our cars (though backup cameras help with that last one!). These blind spots affect how we see, navigate, and interact with the world. Often, we don’t even realise we have blind spots, or what we’re missing, until someone (or something) brings them to our attention. When blind spots lead us to make mistakes or cause an accident, we instantly become aware of them and do our best to fix them. Nobody wants to repeat the same mistake. With data systems, however, there is far less leeway. Data environments have become highly complex, and that complexity creates blind spots that cause problems for users, for management, and for the business as a whole.

How Complexity Creates Data Blind Spots

When it comes to data blind spots, one way to think of them is as “unknown unknowns”: factors an organisation didn’t take into account because it simply didn’t know that it needed to, or even that they were there. One reason for this is that organisations have disparate sources of data.

Any given enterprise IT landscape is rife with directly and indirectly connected applications, microservices, and infrastructure, split across public clouds, private clouds, and on-premises servers, with countless dependencies defining how all these touchpoints interact with one another.

When an organisation has data from multiple sources and systems, it is dealing with what some refer to as “messy data”: multiple data sets that follow different logic or structure and need to be “transformed” to ensure they all speak the same language.

This “messy data” is one of the main indicators of data complexity, and at this level of complexity it can be difficult for any one person or team to have complete visibility over it all. You have a blind spot. In fact, you likely have multiple blind spots (or unknowns) hiding more blind spots. It’s nearly impossible to find and rectify them all. Just as you’re unaware of your blind spots when driving until you have an accident (or a near miss), data blind spots tend to stay hidden until something goes wrong. That might be a broken dependency that causes an application to stop working, an unseen security vulnerability that results in a privacy breach, or a software update that triggers a service failure.

The issues stay hidden until they’ve created problems for end users – and depending on the nature of the problem, the impact could range from frustrated customers to public embarrassment.

In managing these complex data environments, not only is your data team driving through multiple blind spots, but they’re now also stuck in a traffic jam. Blind spot issues force them to trawl through the data environments at a stop-and-go pace, often manually, which slows down their work on everything else, such as:

  • The ability to roll out innovative new features and services.
  • The ability to update features and services already released to the public.
  • The ability to resolve issues that are negatively impacting user experiences.
  • The ability to identify the root causes of these issues and stop them from recurring.

So, not only does data complexity create blind spots, but it can also leave an organisation’s IT teams too overloaded, even paralysed, to find and fix the issues.

How Data Blind Spots Impact Enterprises

When these blind spot issues arise from data complexity, and IT teams can’t address the problems quickly, it harms user experiences, frustrates customers, damages brand reputation, and negatively affects business outcomes.

In fact, in a recent survey of over 250 IT professionals and business decision makers, 93% of respondents indicated that data management complexity is impeding their company’s digital transformation.

In this age of complex data systems, no industry is immune. Enterprises across a wide swath of business sectors experience data blind spot issues and must deal with the impacts.

Data Consolidation within a Financial Firm

A US financial firm in the process of consolidating data from multiple subsidiaries and affiliates into one central finance and risk data lake overlooks a hard-to-find dependency, accidentally introducing a major bug that affects the primary data lake and the reporting systems in all the integrated affiliates. The impact? The corporation has only limited access to financial insights, hampering its ability to make informed, timely, and correct decisions. Full recovery takes more than seven months, with damages estimated at over $60 million.

Multiple Acquisitions for a Healthcare Insurance Company

A healthcare insurance provider undergoes a series of acquisitions, with every acquisition posing a challenge to the IT team, as the team must integrate and consolidate data environments, reconnect all data pipelines, and properly decommission unused parts of the acquired infrastructure. It’s a situation rife with data blind spots and traffic jams for the team. The impact? Sensitive data is left in so-called “dead tables,” with data pipes connected upstream but not downstream.
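One way to picture how lineage metadata could surface such “dead tables” is shown in the minimal sketch below. It assumes lineage is available as simple (source, target) pairs; the table names, the ENDPOINTS set, and the find_dead_tables helper are all hypothetical and not taken from any particular lineage tool.

```python
# Hypothetical lineage edges: (source, target) means data flows from source to target.
EDGES = [
    ("claims_raw", "claims_clean"),
    ("claims_clean", "claims_report"),
    ("members_raw", "members_legacy"),  # still being loaded, but nothing reads it
]

# Hypothetical set of intended end points (reports, dashboards, exports).
# These legitimately have no downstream consumers in the lineage graph.
ENDPOINTS = {"claims_report"}


def find_dead_tables(edges, endpoints):
    """Return tables that receive data but feed nothing and are not end points."""
    has_upstream = {target for _, target in edges}
    has_downstream = {source for source, _ in edges}
    return sorted(has_upstream - has_downstream - set(endpoints))


if __name__ == "__main__":
    print(find_dead_tables(EDGES, ENDPOINTS))  # ['members_legacy']
```

In practice the edge list would come from whatever lineage metadata the organisation collects, and each flagged table would still need a human review before being decommissioned.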

Migration Project within a Global Tech Company

A global technology firm begins a migration project from its existing Teradata infrastructure to Snowflake. In the planning phase, several indirect dependencies are overlooked, so when the migration begins, new problems and broken links (primarily between reports and the data layer) arise daily. The impact? The migration project is restarted several times before being abandoned completely, at a cost of over $20 million.
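The indirect dependencies in a scenario like this can be illustrated with a simple graph traversal. The sketch below is an assumption-laden example rather than a description of any vendor's product: the edge list, the object names, and the downstream_of helper are hypothetical. It shows how a report two or three hops away from a source table still depends on it.

```python
from collections import deque

# Hypothetical lineage edges: (source, target) means target is built from source.
EDGES = [
    ("orders", "orders_clean"),
    ("orders_clean", "daily_sales"),
    ("daily_sales", "sales_report"),
    ("customers", "customer_dim"),
]


def downstream_of(edges, start):
    """Return every object that depends on `start`, directly or indirectly."""
    children = {}
    for source, target in edges:
        children.setdefault(source, []).append(target)

    seen = set()
    queue = deque([start])
    while queue:  # breadth-first walk over the lineage graph
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in seen and child != start:
                seen.add(child)
                queue.append(child)
    return sorted(seen)


if __name__ == "__main__":
    # Everything that could break if the "orders" table is migrated or changed.
    print(downstream_of(EDGES, "orders"))
    # ['daily_sales', 'orders_clean', 'sales_report']
```

Here sales_report never reads orders directly, yet it appears in the result, which is exactly the kind of link that is easy to miss in manual planning.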

Find Your Blind Spots with Data Lineage

As you can see, data blind spots can lead to frustration, financial cost, and loss of trust and reputation. So how can an organisation see through the data blind spots in its systems to avoid these kinds of outcomes? Automated data lineage can help it conduct blind spot analysis to proactively identify and resolve blind spot issues.

Automated data lineage takes the complexity out of your environment by equipping your IT team to fully control their data environments, eliminating the risk of overlooking or accidentally breaking something or wasting time verifying root causes. Eliminating blind spots means greater productivity, greater efficiency, and no unwanted surprises.