The Open Business Data Lake Standard, Part IV

A reference architecture describing standards that help organizations set up an “insights-driven” strategy.

In my previous blog posts (Part I, Part II and Part III) about the Open Business Data Lake Conceptual Framework (O-BDL), I introduced its background, concept, characteristics and platform capabilities. In this fourth part, I want to compare a Data Lake with other data processing platforms.

Due to its characteristics, a Data Lake is a special type of processing platform. This can best be shown by comparing it with the following existing platforms:

  • Data Federation (ETL)
  • Enterprise Service Bus (ESB)
  • High-Performance Computing (HPC)

Data Federation (ETL)
An O-BDL is not a data federation processing platform. While data federation tools can join data across multiple sources, they are normally IT-driven and IT-managed, and they lack the near real-time analytic processing power and agility that users need.

Enterprise Service Bus (ESB)
An O-BDL is not a new version of the Enterprise Service Bus (ESB). While some ESB vendors have been touting near real-time data analytics (i.e., Complex Event Processing, or CEP) for years, those platforms are again centrally managed by IT, and most of the deeper analytic needs require data-at-rest analysis as well, not just data-in-motion analytics.

High-Performance Computing (HPC)
An O-BDL is not a High-Performance Computing (HPC) platform. An O-BDL relies on different architecture principles and software frameworks. While in HPC environments data is moved to a large “super-computing” facility, in an O-BDL processing is distributed and sent where pieces of data are stored.
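The difference between moving data to the computation and moving the computation to the data can be illustrated with a small sketch. This is a toy simulation under stated assumptions, not an O-BDL implementation: the `Node` class, the `run_local` method and the sample partitions are hypothetical names invented for illustration, and a simple sum stands in for a real analytic job.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Node:
    """Hypothetical storage node holding one partition of the data."""
    name: str
    records: List[int]

    def run_local(self, task: Callable[[List[int]], int]) -> int:
        # The task is sent to the node; only its small result travels back.
        return task(self.records)


def centralized_sum(nodes: List[Node]) -> Tuple[int, int]:
    """Copy every record to one place, then compute (data moves to the compute)."""
    moved: List[int] = []
    for node in nodes:
        moved.extend(node.records)
    return sum(moved), len(moved)  # (total, number of records moved)


def data_local_sum(nodes: List[Node]) -> Tuple[int, int]:
    """Run the task where each partition is stored, then combine partial results."""
    partials = [node.run_local(sum) for node in nodes]
    return sum(partials), len(partials)  # (total, number of results moved)


# Illustrative partitions only; real partitions would be far larger.
nodes = [Node("a", [1, 2, 3]), Node("b", [4, 5]), Node("c", [6, 7, 8, 9])]

central_total, records_moved = centralized_sum(nodes)
local_total, results_moved = data_local_sum(nodes)
```

Both approaches produce the same total, but the data-local variant transfers one partial result per node instead of every record. With large partitions, that difference is the reason for sending the processing to where the data is stored.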

While an O-BDL platform differs from the three platforms above, it can be combined with any of them by sitting on top, abstracting away the problems of performance and of working with disparate data sources and targets. Data federation platforms can also be used to create simplified views of the data stored in an O-BDL for business users. Finally, the same physical infrastructure (clusters) could serve as an HPC environment or as an O-BDL.

In the fifth blog post in this series, I’ll discuss the key concepts of an O-BDL and describe how it should work.
