I like to follow Ray Wang’s posts. I have known Ray for quite a while and well understand why he is thought of as the ‘analysts’ analyst’; his views are well worth reading and usually make me think. I remain convinced that we are heading into an era when we will have to think seriously about data in every sense, from formats to provenance, from ontology to tagging, and indeed how we use data individually and corporately. So I was immediately interested to see Ray’s post, ‘Research Report: Rethink your next generation Business Intelligence Strategy’. As I expected it was insightful, and pinpointed four key issues around how what Ray calls ‘The Information Management Matrix’ drives next generation business intelligence. I cannot recommend highly enough reading his views carefully and using them to consider your options.
Ray got me thinking about, and comparing, other views and experiences of what some are calling ‘Big Data’: the idea that we are moving into a period when the sheer volume of data that needs to be ‘handled’ within an enterprise will become a major strategic and operational issue. This extends the original meaning of the term, which arose in software engineering to describe data sets that have become too large to be used conventionally. EMC in particular, as well as other vendors, likes the term and has views on it, for obvious reasons given that its business starts with the storage of data and goes through many permutations of everything else associated with it. However, it was a post by another long-time hero of mine that really hit home. You can’t be much more famous than Tim O’Reilly when it comes to the Web and using it to create new ways of working; after all, this is the man who gave us the term and the definition for Web 2.0!

Back in May Tim posted a look at the creation of the USA’s Nationwide Health Information Network (NHIN), the work on Open Health Care Records, and Government as a Platform, all as part of his focus on Government 2.0. At the time I read it in connection with Gov 2.0, and there was much about the piece that I found interesting, so it stayed in my mind. The link to Ray’s piece and Big Data lies at the heart of Tim’s post, which opens by stating that the goal is not to build a massive centralised database for health records, but instead to take a different path. Tim identifies the goal of the USA’s Office of the National Co-ordinator (ONC) as defining the rules of the road for the interchange of patient records.
He goes on to say that, in true Internet style, the expectation is that common protocols and file formats will allow vendors to compete on a level playing field to build the actual applications (to which I would add ‘to exploit the data’). The rest of the post outlines the approach that will be taken to define and achieve these common elements. Tim’s argument for a long time has been that government needs to provide a platform in this way to allow others, both private individuals and enterprises, to enrich it with their own additions. As you may know from other posts of mine, I am equally convinced that the key to using clouds and/or web services successfully is the platform, or perhaps I should say PaaS, Platform as a Service, to be consistent with the new environment of clouds and web services.
There is a big question hanging over this. If we are bringing together these disparate apps or services, then we must be doing the same for data, i.e. the Big Data concept. But how do we then navigate this data so that Ray’s next generation BI and Tim’s Government as a Platform can give everyone the data they need, in the context they need it in? This brings me to the six-page post that really has been hanging around in my mind for the last few weeks, with the simple title ‘Citizen Dan goes Live; available for Download’. Citizen Dan is a semantic framework which Structured Dynamics describes as an Open Source system, available free to any community, that allows multiple diverse data feeds to be navigated, sliced and diced, supplied as feeds to portals, and so on. The example sources include the Web itself, census data, real-time feeds, government data sets, and crowd-sourced data, all in either structured or unstructured form.
Okay, it’s only the first release so it’s early days, but this seems to start moving the concept of semantic data enabling Big Data, and supporting platforms, towards reality. Even better, there is the capability to download and experiment. So if you are interested in taking this further, there is a six-page description of the elements and their use, plus a downloadable example. A further blog post from Fred of Structured Dynamics gives even more material and details of how to get involved with the community effort.
Suddenly it seems as if the pieces are coming together: Ray’s definition of where BI must and will move to as part of coping with the Big Data environment, Tim’s definition of creating Platforms to enable data interchange highways, and finally Structured Dynamics’ Citizen Dan delivering the semantic enablement that ties the two together.