Can you really trust your data?
By Gary Allemann, Managing Director of Master Data Management
The ability to analyse data and derive insight from it is undeniably essential for any business today. However, when data is moved through various systems and manipulated in order to provide meaningful reports, there is also the potential for doubt to creep in. Executives cite a lack of trust as a primary inhibitor to using data to drive decisions.
Understanding where data comes from and ensuring you can trust that it is accurate is essential for the adoption of both analytics and Artificial Intelligence (AI) programmes.
Where is your data coming from and what happened on the way?
In order for data to be of any use to an organisation it needs to be extracted, analysed and reported on so that insight can be attained. Data therefore moves through an organisation in numerous ways from multiple systems.
Reporting often requires the application of calculations and aggregations, such as roll-ups to averages and summaries, all of which can make subtle alterations to the data.
When data is manipulated in this manner it may no longer reflect the source, and as a result, conflicting insights can be obtained from ostensibly the same information.
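As a small illustration of how a roll-up can produce conflicting answers from the same information, the sketch below computes an "average order value" in two ways: directly from the source rows, and from a pre-aggregated regional summary. The figures, region names and function names are hypothetical and chosen only to make the effect visible.

```python
from statistics import mean

# Hypothetical order values per region (illustrative data only).
orders = {
    "north": [100.0, 200.0, 300.0],
    "south": [1000.0],
}


def overall_average(orders_by_region):
    """Average computed directly from every source row."""
    all_values = [v for values in orders_by_region.values() for v in values]
    return mean(all_values)


def average_of_averages(orders_by_region):
    """Average computed from a regional roll-up (one average per region)."""
    regional_averages = [mean(values) for values in orders_by_region.values()]
    return mean(regional_averages)


direct = overall_average(orders)         # 1600 / 4 rows
rolled_up = average_of_averages(orders)  # (200 + 1000) / 2 regions

print(f"From source rows:     {direct}")
print(f"From regional rollup: {rolled_up}")
```

Both figures could reasonably be labelled "average order value", yet they differ (400.0 versus 600.0 here) because the roll-up gives each region equal weight regardless of how many orders it contains. Without knowing which calculation sits behind a report, a reader cannot tell which number to trust.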
The way that data flows through the enterprise and the purposes for which it is utilised can make it difficult to understand how conclusions were derived.
Today, some Extract, Transform and Load (ETL) solutions have built-in tools to trace this movement as well as any changes made to data – this ability is known as data lineage.
In practice, any organisation may have multiple ETL systems, each providing its own disparate view, as well as SQL (code) based transformation processes and even manual “fixes” that adjust data to deliver better results.
Trust in the data to trust in the insight
Without a complete understanding of the data, trust in the information is eroded. This is especially true when a potential conflict arises, with multiple answers to the same questions.
In these instances, it is difficult to place trust in the insight provided, and decision makers will often fall back on ‘gut feel’. This negates the entire purpose of performing data analysis in the first place.
It’s not just about lineage, it’s about traceability
Data lineage provides some insight, but on its own it often delivers an incomplete picture because the built-in tools within ETL systems cannot document all of the movements and manipulations made to data.
Business traceability has become almost more important than data lineage, because it is crucial to be able to obtain a complete view of data throughout its lifecycle.
In addition, while ETL tools generate a view of the processes built into their tools, this is frequently highly technical and thus difficult for business owners to understand. Business traceability offers a simplified, yet more comprehensive, view into where data has come from and what has happened to it during the process.
The right tools for the job
Organisations therefore need a tool that confirms not only that data is being sourced from the right place, but also that what is presented is an accurate representation of that source.
Here, automation is key, particularly when it comes to collating data movement across multiple ETL tools and data manipulation procedures.
Today, there are technologies available, like MANTA Flow, that harvest and consolidate different data lineages into a single view to provide a complete, end-to-end picture of lineage, which in turn facilitates complete business traceability.
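To make the idea of consolidation concrete, here is a minimal sketch of merging lineage fragments from several places into one view and tracing a report back to its origins. It is not MANTA Flow's API or method; the dataset names, the edge format and the functions `consolidate`, `trace` and `original_sources` are all hypothetical, assumed purely for illustration.

```python
from collections import defaultdict

# Hypothetical lineage fragments, each a list of (source, target, how) edges.
etl_tool_a = [("crm.customers", "staging.customers", "ETL tool A: load")]
sql_scripts = [("staging.customers", "dw.customer_summary", "SQL: GROUP BY region")]
manual_fixes = [("finance_adjustments.xlsx", "dw.customer_summary", "manual fix")]


def consolidate(*fragments):
    """Merge edge lists into one map: target -> list of (source, how)."""
    upstream = defaultdict(list)
    for fragment in fragments:
        for source, target, how in fragment:
            upstream[target].append((source, how))
    return upstream


def trace(upstream, dataset):
    """Return every (source, target, how) step that feeds `dataset`."""
    steps, seen, stack = [], set(), [dataset]
    while stack:
        target = stack.pop()
        if target in seen:
            continue  # guard against cycles and repeated visits
        seen.add(target)
        for source, how in upstream.get(target, []):
            steps.append((source, target, how))
            stack.append(source)
    return steps


def original_sources(upstream, dataset):
    """Datasets in the trace that have no recorded upstream of their own."""
    return {src for src, _, _ in trace(upstream, dataset) if not upstream.get(src)}


lineage = consolidate(etl_tool_a, sql_scripts, manual_fixes)
steps = trace(lineage, "dw.customer_summary")
roots = original_sources(lineage, "dw.customer_summary")

for source, target, how in steps:
    print(f"{source} -> {target} ({how})")
print("Original sources:", sorted(roots))
```

In this sketch, no single fragment shows that the summary depends on both the CRM system and a manually maintained spreadsheet; that only becomes visible once the ETL, SQL and manual steps are combined into one view.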
Ultimately, insight is only valuable if the data it is derived from can be trusted to give an accurate picture. For many organisations this means they need to understand the source of their data and get a true picture of how it moves and changes through their organisation.
Automated tools can simplify this process and ensure you can trust data to provide accurate intelligence for enhanced decision-making capability.