The quest for data - now where did I put it again?

Posted on: March 27, 2018 by

This is the fifth article in a series of five publications on organizations’ Data Analytics strategy. The series, called ‘Moving beyond point solutions and pilots in Data Analytics’, addresses five challenges that organizations most often face when they want to upgrade their Data Analytics from point solutions to strategic, data-driven and coherently governed activities.

This fifth article explains how Data Analytics depends on data usage. To enable future-proof Data Analytics, your data has to be top-notch, and in order to use your data you need to be able to find it. This might seem odd, but it turns out to be a real challenge for some businesses. Data is all over the place, in all shapes, formats and sizes. It tends to outgrow your organization, which is why you want to manage your data: tracking it and being able to find it again. Managing your data stimulates effective usage. Managing your data at metadata level provides a way to beat your competition. Here’s why:

Being able to find your data is an ever-growing challenge these days: the data is there waiting for you, but finding it is the hard part. As we gather more and more data, we see our clients struggling more and more to find their data collections again. How efficient and effective is your big data strategy when you cannot retrieve your data, do not know which data formats to use, or do not know how to re-use your data? And believe us, you don’t want to start collecting your data all over again, time after time.

In managing your data, three levels are particularly relevant to govern:

  • Organizational level: appoint ‘owners’ of the data and make them responsible for data quality.
  • Architectural / operational level: decide what data to use and where to find it, so that you can quick-start effective data usage.
  • Metadata level: make data findable and determine the meaning and possible purposes of the data, enabling you to use its full potential.

What should my metadata management contain?

The solution to the growing problem of not being able to find data again is metadata management. Metadata management means maintaining a description of your data: where to find what, how the various formats are documented and how data of different sizes is stored effectively. Attached to your metadata is a business glossary, which explains technical terms. It tells you in which table, column and field to find the data you are looking for, connecting your algorithms to your data. In addition, it contains the technical agreements on how data is stored, e.g. whether postal codes are saved as ‘1234 AB’ or ‘1234AB’, and it pays special attention to the meaning of your data.
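To make this concrete, here is a minimal sketch of what a single glossary entry and a storage agreement could look like in Python. The names GlossaryEntry, GLOSSARY, locate and normalize_postal_code, as well as the table and column names, are hypothetical and chosen for illustration only; the sketch also assumes that the agreed storage format is ‘1234 AB’ (four digits, a space, two capital letters).

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class GlossaryEntry:
    """One business term and where its data lives (hypothetical structure)."""
    term: str
    definition: str
    table: str
    column: str
    storage_format: str


# Hypothetical glossary with a single entry; table and column names are made up.
GLOSSARY = {
    "postal code": GlossaryEntry(
        term="postal code",
        definition="Postal code of the customer's home address",
        table="customer_address",
        column="postal_code",
        storage_format="1234 AB",
    ),
}

# Accepts both '1234 AB' and '1234AB', in upper or lower case.
_POSTAL_CODE = re.compile(r"^(\d{4})\s?([A-Za-z]{2})$")


def locate(term: str) -> str:
    """Return 'table.column' for a business term from the glossary."""
    entry = GLOSSARY[term.lower()]
    return f"{entry.table}.{entry.column}"


def normalize_postal_code(raw: str) -> str:
    """Convert a postal code to the agreed storage format '1234 AB'."""
    match = _POSTAL_CODE.match(raw.strip())
    if match is None:
        raise ValueError(f"Not a recognized postal code: {raw!r}")
    digits, letters = match.groups()
    return f"{digits} {letters.upper()}"
```

With an agreement like this in place, an algorithm can ask the glossary where ‘postal code’ lives and rely on every value arriving in one format, whichever way it was originally entered.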

Current developments make metadata management even more necessary

Current developments in regulation make it necessary to also document how long data may be kept (the retention policy) and privacy aspects such as user authorization. Something to be aware of.
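As an illustration, retention and authorization can be recorded next to the rest of your metadata. The sketch below is only one possible shape, under stated assumptions: GovernanceRecord, is_past_retention and may_access are hypothetical names, and the one-year retention period and the role names are invented for the example, not taken from any regulation.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class GovernanceRecord:
    """Retention and privacy metadata for one dataset (hypothetical structure)."""
    dataset: str
    retention_days: int
    contains_personal_data: bool
    authorized_roles: frozenset


def is_past_retention(record: GovernanceRecord, collected_on: date, today: date) -> bool:
    """True when data collected on 'collected_on' has exceeded its retention period."""
    return today > collected_on + timedelta(days=record.retention_days)


def may_access(record: GovernanceRecord, role: str) -> bool:
    """True when the given role is authorized to use the dataset."""
    return role in record.authorized_roles


# Example record; the values are assumptions for illustration only.
EXAMPLE = GovernanceRecord(
    dataset="customer_address",
    retention_days=365,
    contains_personal_data=True,
    authorized_roles=frozenset({"data_steward", "analyst"}),
)
```

Keeping these terms in the metadata, rather than in people's heads, means a check such as is_past_retention can run automatically before a dataset is re-used.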

Setting up a metadata management level prevents you from starting over again and again.

Everything advocated above applies to your classic, structured data management, but even more to your unstructured data. That is an insight some of your competitors might not have discovered yet.

In metadata management, the essence is being able to retrieve all collected data. Finding the data that you have saved can be a challenge in its own right. With our clients, we often see several repositories containing different formats, sizes and shapes of data. To increase the usability of this data, and to be able to locate it, business glossaries, descriptions of data repositories and metadata management tools are implemented. All to serve the purpose of collecting the right data, at the right time.

The struggle with business glossaries and descriptions of data repositories is effort versus benefit: setting up these records is a daunting job. It is a very costly one as well: think of ten people for one year, as we have come across with some of our clients. The risks are high too: a business glossary that is not maintained quickly erodes the value of your data. Data that has lost its value becomes data that is not used. Now that’s a waste!

Luckily, there are two options to simplify your life:

  1. Only make a business glossary for those repositories that are most important (Agile-style)
  2. Use artificial intelligence

Artificializing your data

Artificial intelligence can help by reading all data repositories and business glossaries. It is able to link specific data fields, connecting databases and data repositories, and in doing so competes with manual human work. AI can be applied in two directions: top-down (‘are the suggestions made by AI correct?’) and bottom-up (‘looking at the data, what does AI suggest as the next best action?’). Both generate insights and make results available quickly. Companies that apply AI achieve response times that beat their competition, which lets them make more effective use of their metadata management. Something to consider, as the future is evolving towards AI: companies that apply their metadata management adequately beat their competitors.
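To give a feel for what ‘linking data fields’ means in practice, here is a deliberately simple sketch. It is not artificial intelligence: it uses plain string similarity from Python's standard library (difflib.SequenceMatcher) as a stand-in for a trained model, and the function names, column names and the 0.8 threshold are assumptions made for this example. It does show the top-down pattern: the tool suggests links, and a human checks whether the suggestions are correct.

```python
from difflib import SequenceMatcher


def _normalize(name: str) -> str:
    """Lower-case a column name and drop separators so naming styles compare equally."""
    return name.lower().replace("_", "").replace("-", "")


def similarity(a: str, b: str) -> float:
    """String similarity between two column names, between 0.0 and 1.0."""
    return SequenceMatcher(None, _normalize(a), _normalize(b)).ratio()


def suggest_links(left_columns, right_columns, threshold=0.8):
    """Suggest column pairs from two repositories that probably hold the same data.

    Returns (left, right, score) tuples, best match first, for human review.
    """
    suggestions = []
    for left in left_columns:
        for right in right_columns:
            score = similarity(left, right)
            if score >= threshold:
                suggestions.append((left, right, round(score, 2)))
    return sorted(suggestions, key=lambda item: item[2], reverse=True)
```

For example, suggest_links(["postal_code", "customer_id"], ["PostalCode", "order_total"]) proposes linking postal_code to PostalCode and leaves the other columns alone. A real AI-based tool would also look at the data values themselves and not just at the names.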

We see this same trend in the market: software suppliers that used to deliver ETL are now specializing in AI to strengthen their propositions.

That’s the winning deal.

The research for this article, which is part of a series of five with the collective title 'Moving beyond point solutions and pilots in Data Analytics', was done by my team at Atos Consulting. My thanks therefore go out to Tom Konings, David van Steen, Marcel van de Pol and Carline Nauta.
