The Open Business Data Lake Standard, Part IV

A reference architecture describing standards that help organizations set up an “insights-driven” strategy.

In my previous blog posts (Part I, Part II and Part III) about the ‘Open Business Data Lake Conceptual Framework’ (O-BDL), I introduced its background, concept, characteristics and platform capabilities. In this fourth part I want to compare a Data Lake with other data processing platforms.

Due to its characteristics, a Data Lake is a special type of processing platform. This can best be shown by comparing it with the following existing platforms:

  • Data Federation (ETL)
  • Enterprise Service Bus (ESB)
  • High-Performance Computing (HPC)

Data Federation (ETL)
An O-BDL is not a data federation processing platform. While data federation tools can cross-join data from multiple sources, they are normally driven and managed by IT, and they lack the near real-time analytic processing power and agility that users need.

Enterprise Service Bus (ESB)
An O-BDL is not a new version of the Enterprise Service Bus (ESB). While some ESB vendors have been touting near real-time data analytics (i.e., Complex Event Processing, or CEP) for years, those platforms are again centrally managed by IT, and most of the deeper analytic needs require data-at-rest analysis as well, not just data-in-motion analytics.

High-Performance Computing (HPC)
An O-BDL is not a High-Performance Computing (HPC) platform. An O-BDL relies on different architecture principles and software frameworks. While in HPC environments data is moved to a large “super-computing” facility, in an O-BDL the processing is distributed and sent to where the pieces of data are stored.
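
To make the contrast concrete, here is a minimal Python sketch of the two approaches. It is only an illustration and is not part of the O-BDL standard: the `Node` class, the two functions and the sample records are all hypothetical, and a real platform would use a distributed processing framework instead of plain lists.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Node:
    """Hypothetical storage node holding one piece of the data."""
    name: str
    records: List[int]


def centralized_sum(nodes: List[Node]) -> Tuple[int, int]:
    """HPC-style sketch: copy every record to one place, then compute.

    Returns the total and the number of values that had to be moved.
    """
    moved = [record for node in nodes for record in node.records]
    return sum(moved), len(moved)


def distributed_sum(nodes: List[Node]) -> Tuple[int, int]:
    """Data-lake-style sketch: each node computes on its own data.

    Only one partial result per node travels to the coordinator.
    Returns the total and the number of values that had to be moved.
    """
    partials = [sum(node.records) for node in nodes]
    return sum(partials), len(partials)


# Illustrative data only.
nodes = [Node("a", [1, 2, 3]), Node("b", [4, 5]), Node("c", [6])]

central_total, central_moved = centralized_sum(nodes)
local_total, local_moved = distributed_sum(nodes)
```

Both functions produce the same total, but the second moves one partial result per node instead of every record, which is the reason for sending the processing to the data.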

While an O-BDL platform differs from the three platforms above, it can be combined with any of them by sitting on top, abstracting away performance concerns and the work of handling disparate data sources and targets. Data federation platforms may also be used to create simplified views of the data stored in an O-BDL for business users. Finally, the same physical infrastructure (clusters) could be used as an HPC environment or as an O-BDL.

In the fifth blog in this series I’ll discuss the key concepts of an O-BDL and describe how it should work.
