As the windows of opportunity for intelligent decision-making within modern businesses grow ever smaller, the proliferation and exponential expansion of data is driving a rethink of management information strategy, technology and approach.
Big Data has arrived but are we ready to deliver?
The era of Big Data may well have arrived, bringing with it a plethora of hardware- and software-accelerated approaches that process information faster and at greater scale than ever before. Of course appliances, Hadoop and massively parallel processing should be integral to our future but, instead of looking at the issue solely in terms of increasing horsepower like the proverbial ‘IT boy-racers’, wouldn’t a focused ‘tuning’ exercise be more appropriate?
Technology innovation within the confines of Moore’s law will continue to evolve. Yet, just as mobile device evolution is still hampered by the lagging capabilities of battery technology, information management architectures, despite their burgeoning processing speed and capabilities, still face fundamental challenges in dealing with ambiguous data and its context.
In this sense, I believe that processing the right data rather than big data should be the focus of today’s information-centric organisation.
So, how do we select the right data for a given business question or context?
Well, in my opinion, an information management strategy needs to implement the following steps:
- Develop a common information model for core data that spans lines of business within your organisation (no more than 25 information objects!)
- Clearly delineate ownership in terms of business area, business process and underlying applications that interact with and control each core information object
- Now prioritise the order in which you wish to address the core data objects, in business terms
- Develop a basic taxonomy and ontology for each information object, including ‘synonyms’ where appropriate (e.g. a client is a customer, a prospect is not a customer, and so on. Yes, we went with customer first)
- Establish a data quality threshold against the seven axes of data quality, above which data is deemed acceptable for business consumption
- Implement data governance policies that focus on ensuring these quality thresholds are maintained
- Develop data quality routines and master data management (MDM) processing to enforce quality at point of entry and at synchronisation points across your application portfolio
- Continue to cycle through these steps, persistently improving the quality and context of the information at hand
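The first few steps above can be sketched in a few lines of code. Everything here is illustrative: the object names, the synonyms, the axis labels (one common reading of seven quality dimensions) and the threshold value are hypothetical stand-ins, not a prescribed model.

```python
# Step 1: a small common information model -- far fewer than 25 objects here.
CORE_OBJECTS = ["customer", "product", "order"]

# Step 4: a basic taxonomy with synonyms ("a client is a customer,
# a prospect is not a customer").
SYNONYMS = {"client": "customer", "purchaser": "customer"}

def resolve(term):
    """Map a business term onto a core information object, or None."""
    term = term.lower()
    if term in CORE_OBJECTS:
        return term
    return SYNONYMS.get(term)  # returns None for e.g. "prospect"

# Step 5: per-axis scores (0.0-1.0) and a single threshold above which
# data is deemed acceptable for business consumption. The axis names
# below are one common reading of the seven axes, chosen for illustration.
QUALITY_AXES = ["completeness", "accuracy", "consistency", "timeliness",
                "validity", "uniqueness", "integrity"]
THRESHOLD = 0.8  # hypothetical value

def acceptable(scores):
    """A record passes only if every axis meets the threshold."""
    return all(scores.get(axis, 0.0) >= THRESHOLD for axis in QUALITY_AXES)
```

With this in place, `resolve("client")` yields `"customer"` while `resolve("prospect")` yields `None`, and `acceptable` gives governance a single yes/no gate to police.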
As you can see, that translates into significant effort.
So, how do we accelerate Big Data implementations?
The emerging field of Data Discovery technology seems a promising solution. These tools create cross-application data models by examining not only data structures but also data content to determine the nature and context of the information at hand. They can also derive business rules and logic locked away in the application and data code itself.
These technologies are now being integrated into mainstream data integration life-cycle frameworks, acting as a precursor to data profiling, ETL, data quality and MDM components on the same metadata foundation. Yet most institutions do not perform a discovery phase before delivering an integration initiative.
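A toy version of that content-based approach: rather than trusting a column’s name or declared type, sample its values and infer what it actually holds. The patterns, labels and match ratio below are illustrative assumptions, a long way short of what commercial discovery tools do.

```python
import re

# Hypothetical content patterns a profiler might test a column against.
PATTERNS = {
    "email":  re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "date":   re.compile(r"^\d{4}-\d{2}-\d{2}$"),
    "number": re.compile(r"^-?\d+(\.\d+)?$"),
}

def infer_content_type(values, min_match=0.9):
    """Label a column by the pattern most of its non-empty values match."""
    sample = [v for v in values if v]
    if not sample:
        return "unknown"
    for label, pattern in PATTERNS.items():
        hits = sum(1 for v in sample if pattern.match(v))
        if hits / len(sample) >= min_match:
            return label
    return "text"
```

Run against a column named, say, `REF_CODE_2` that turns out to contain email addresses, this kind of check surfaces the real nature of the data and flags where two applications are describing the same information object in different clothes.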
I believe that using discovery technology at the outset of your data initiative, to plot the most appropriate information path across the existing landscape while building an understanding of the structure, content and context of the data, will minimise the need to persistently process too much ‘Big Data’.
After all, if, with appropriate context, you can reduce the complexity and quantity of data needed to inject intelligence into your organisation, while gradually improving both corporate understanding and the underlying data quality, that’s a good thing, isn’t it?