Semantic: The next good thing you do with data


Posted on: February 24, 2015 by Thierry Caminel

In a recent article[1], Gartner master data management (MDM) expert Andrew White asked the question “Why MDM might be the last good thing you do with data”, and explained that “we need to focus less on data and application integration, and less on API design, and focus first more on semantic exposure for interoperability. Once we have a shared language, the effort and methods of data exchange between hubs and applications will become simpler”.


One goal of Semantic Web and Linked Data technologies, as presented in the recent Atos analysis “Journey 2018”, is to provide such a shared language, along with methods to expose the semantics of data and reduce the burden of data exchange. As they mature, these technologies may become a credible alternative to standard MDM approaches.

Let’s take a very simple example in which we need to integrate two databases of people, one using a field called “last_name” and the other “FamilyName”. One solution could be to pick one of the two terms as the ‘reference’, or to introduce a third term such as “lastName”, but this is not satisfactory because we still have no clear description of the semantics of the field. A better approach is to use a term defined in a kind of “shared language” called an ontology, such as the FOAF vocabulary, which provides a well-tested semantic model for describing people and organizations. This term, uniquely referenced by a URI[2] (Uniform Resource Identifier), can replace the original terms; alternatively, we can write a rule stating that the terms are equivalent and use an inference engine to automate the mapping between them. The same kind of process can be used to align the original database models with a more general one that is better defined semantically and based on a graph structure. The process can be incremental: information from other data sources can be integrated continuously by creating similarity links between terms and abstracting common concepts into higher-level enterprise ontologies. Progressively we can build that enterprise “shared language” that makes “data exchange between hubs and applications become simpler”.
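The mapping step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a real inference engine: the records, the third field name “surname” and the helper functions are invented for the example, and the shared term is written in its FOAF namespace form, http://xmlns.com/foaf/0.1/lastName. Each rule declares one term equivalent to another, and a small resolver follows the rules until it reaches a term with no further rule.

```python
# Shared term from the FOAF vocabulary (namespace form of foaf:lastName).
FOAF_LAST_NAME = "http://xmlns.com/foaf/0.1/lastName"

# Equivalence rules: "this local term means the same as that term".
# "surname" is a hypothetical field from a third source added later,
# declared equivalent to an existing local term rather than to FOAF directly.
EQUIVALENT_TO = {
    "last_name": FOAF_LAST_NAME,
    "FamilyName": FOAF_LAST_NAME,
    "surname": "FamilyName",
}


def resolve(term: str) -> str:
    """Follow equivalence rules until a term with no further rule is reached."""
    seen = set()
    while term in EQUIVALENT_TO and term not in seen:
        seen.add(term)  # guards against accidental cycles in the rules
        term = EQUIVALENT_TO[term]
    return term


def align(record: dict) -> dict:
    """Rewrite the keys of a record to their resolved shared terms."""
    return {resolve(key): value for key, value in record.items()}


# Hypothetical records from the two original databases and a later source.
db_a = [{"last_name": "Lovelace"}]
db_b = [{"FamilyName": "Turing"}]
db_c = [{"surname": "Hopper"}]

merged = [align(record) for record in db_a + db_b + db_c]
last_names = [record[FOAF_LAST_NAME] for record in merged]
print(last_names)
```

The third source shows the incremental aspect: it is integrated by adding a single rule, without touching the other two databases or the code that queries the shared term.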

Recently standardized technologies make this approach to data management easier. One of the most interesting is JSON-LD, an extension of the popular JSON data format in which each field of a message can be mapped to a concept in an ontology, and which supports hyperlinking of resources over HTTP. The standard has been designed to allow a smooth transition from existing JSON APIs, enabling developers to add Linked Data principles to their APIs gradually.
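As an illustration, here is a minimal Python sketch of how an existing JSON message might be turned into JSON-LD by adding an “@context”. The message, the example.org URLs and the helper functions are assumptions made for this example. The expand_keys function is a deliberately simplified, hand-written stand-in for what a real JSON-LD processor does during expansion; it only renames top-level keys and does not implement the JSON-LD expansion algorithm.

```python
import json

# A plain JSON message as an existing API might return it (hypothetical data).
plain = {"last_name": "Lovelace", "homepage": "http://example.org/ada"}

# The context maps each field to a term in the FOAF vocabulary.
# "@type": "@id" says that the value of "homepage" is itself a link.
context = {
    "last_name": "http://xmlns.com/foaf/0.1/lastName",
    "homepage": {"@id": "http://xmlns.com/foaf/0.1/homepage", "@type": "@id"},
}

# The original fields are untouched: existing clients keep working,
# while Linked Data clients can read the added "@context" and "@id".
document = {
    "@context": context,
    "@id": "http://example.org/people/ada",
    **plain,
}


def term_iri(definition) -> str:
    """Return the IRI of a context entry (either a string or an object)."""
    return definition if isinstance(definition, str) else definition["@id"]


def expand_keys(doc: dict) -> dict:
    """Simplified illustration: replace mapped top-level keys by their IRIs."""
    ctx = doc.get("@context", {})
    return {
        (term_iri(ctx[key]) if key in ctx else key): value
        for key, value in doc.items()
        if key != "@context"
    }


print(json.dumps(document, indent=2))
expanded = expand_keys(document)
print(json.dumps(expanded, indent=2))
```

Two APIs that use different field names but map them to the same FOAF terms would produce the same keys after expansion, which is what makes their data directly comparable.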

Semantic Web techniques have already proven successful at integrating complex and heterogeneous datasets such as those found in healthcare. It is likely that the emergence of these new standards will significantly reduce the gap between information and knowledge in many other sectors. It could be the next good thing you do with data.

[1] Andrew White (2014), “Why MDM might be the last good thing you do with data” - http://gtnr.it/19oxH7B

[2] Semantic Web terms are identified by Web URIs. Here: http://xmlns.com/foaf/spec/#term_lastName



About Thierry Caminel

Thierry Caminel is a senior technical architect and business development manager at Atos, and a member of the Scientific Community. His main focus areas are distributed systems, context-aware computing, Cloud and event-driven architecture. Before joining Atos he worked in several startups innovating in the fields of Artificial Intelligence, Web 2.0 and Machine to Machine, and led embedded systems development for space experiments. Thierry holds a software engineering degree and lives in Toulouse.
