CTO Blog

Opinions expressed on this blog reflect the writer’s views and not the position of the Capgemini Group

Big data is not a voluntary move, it’s happening to you NOW

It seems as though big data is the latest candidate for the hype cycle. I reacted to a number of articles and white papers a month or so back, yet the topic is back on the agenda again. The triggers this time are a press interview in which I was asked to give my views, and the publication of a very good white paper by a colleague, Steve Jones. By very good I mean a very practical account of what to consider and what to focus on getting right.

But then you are not going to adopt big data for a while yet, are you? So no need to read this just yet! Wrong!! The point about big data is that it’s an environmental shift taking place all around us, driven by the change in what people are doing online and with what devices. My point in the previous post ‘Big data or is the accumulation of small data the real issue?’ was that big data is made up of the huge amount of small data that we handle and accumulate as we increasingly use a rich variety of sources other than enterprise applications.

There are two lines of thought around at the moment, and it is necessary to separate them in order to clarify my title. The first is merely to see that technology change, in the form of more powerful systems and tools, allows a level of analysis of business information not previously possible. That’s certainly true, but this approach remains focused on structured data and relational databases; in other words, the trusted data that we capture and process internally today and, most important of all, categorize in order to be able to make use of it.

Okay, so most of the comments on this topic do include ‘unstructured’ data in their headings, but the reality is that the overall approach is still defined by the need for some form of categorization in order to make sense of any form of analysis. My colleague Steve Jones has just written an excellent paper on this topic called 'Master Data Management Mastering the Information Ocean' which offers some really practical approaches to making a workable solution. But this is still big data = big analytics, empowered by cloud or grid technology that makes the necessary computational power available at a sensible price.

It’s the other side of big data that is the game changer: the sheer volume of interactions and accumulations by users with all kinds of data in an online world. This is the real unstructured material that is ‘happening’ in every enterprise today, creating not just a headache over the provision of storage, but a nightmare of untrusted data in circulation. Or could it just be the new face of Business Intelligence? It’s the ability to react to external market information in real time to optimize a response, instead of the current after-the-event big data analysis to see how well it all turned out.

As with many of these new approaches, an example makes it easier to understand. A sales person sees from a news item and a link to a competitor’s Web site that the competitor will be holding a promotion in the local area for the next month. After collaboration with their manager, they change their pricing in a key account where the competitor is active to ensure that they do not lose business. That’s what real-time decision support tied to social collaboration tools is all about: the de-centralization of business models in order to respond to, and optimize for, local conditions.

However, if this activity was captured and used badly in a ‘big data analytics model’, it would be an example of untrusted data contaminating the outcome. Of course it would be possible to ‘recognize’ this data and categorize it into a model to see what would happen if this competitor behaved the same way at a country, or even global, level. However, by the time this sort of modeling was carried out, in all probability the competitor would already have done it if it intended to!!! Again, BI after the event!

What this defines is the need for a new category of data which I call ‘trusted in context’, i.e. in the context of the customer account in question, at that time, with that competitor, the data could be trusted enough to act upon. It’s happening right now in most enterprises, sometimes coupled with tools like Salesforce.com Chatter, which may or may not be an enterprise-provisioned capability. That’s the game change which big data brings: the data available to support local operational decision making within a limited context is as big as all the data reachable by a search engine on the Web.
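For readers who think in code, here is one way the ‘trusted in context’ idea could be sketched. This is only an illustration under my own assumptions, not a description of any product: the class names `TrustContext` and `ContextualFact`, the method `is_trusted_for`, and the account, competitor and dates are all hypothetical. The point it tries to show is that the data carries its own scope (account, competitor, time window) and is treated as trusted only inside that scope.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TrustContext:
    """Hypothetical scope within which external data may be acted upon."""
    account: str
    competitor: str
    valid_from: date
    valid_until: date


@dataclass(frozen=True)
class ContextualFact:
    """An external observation tagged with the context that makes it usable."""
    source: str
    claim: str
    context: TrustContext

    def is_trusted_for(self, account: str, competitor: str, on: date) -> bool:
        # Trusted only for the same account, the same competitor,
        # and a date inside the validity window (inclusive).
        ctx = self.context
        return (
            account == ctx.account
            and competitor == ctx.competitor
            and ctx.valid_from <= on <= ctx.valid_until
        )


# Illustrative values only; none of these come from a real system.
fact = ContextualFact(
    source="competitor web site",
    claim="local promotion running for one month",
    context=TrustContext(
        account="Key Account A",
        competitor="Competitor X",
        valid_from=date(2012, 3, 1),
        valid_until=date(2012, 3, 31),
    ),
)
```

Asking `fact.is_trusted_for("Key Account A", "Competitor X", date(2012, 3, 15))` answers yes, while the same question for another account, or for a date after the promotion has ended, answers no. Outside its context the observation goes back to being untrusted data, which is exactly why it should not flow unmarked into a central analytics model.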

The question is, do you realize that this is happening, and what should you be doing to provide guidance and policies about how such data is used, stored and, most of all, controlled in its acceptance as enterprise trusted data? Here is a link to a good site where a discussion under the title of ‘Unstructured Data: The Elephant in the Big Data Room’ is running.

About the author

Andy Mulholland
Capgemini Global Chief Technology Officer until his retirement in 2012, Andy was a member of the Capgemini Group management board and advised on all aspects of technology-driven market changes, together with being a member of the Policy Board for the British Computer Society. Andy is the author of many white papers, and the co-author of three books that have charted the current changes in technology and its use by business, starting in 2006 with ‘Mashup Corporations’ detailing how enterprises could make use of Web 2.0 to develop new go-to-market propositions. This was followed in May 2008 by ‘Mesh Collaboration’ focussing on the impact of Web 2.0 on the enterprise front office and its working techniques, then in 2010 ‘Enterprise Cloud Computing: A Strategy Guide for Business and Technology leaders’ co-authored with well-known academic Peter Fingar and one of the leading authorities on business process, John Pyke. The book describes the wider business implications of Cloud Computing with the promise of on-demand business innovation. It looks at how businesses trade differently on the web using mash-ups but also the challenges in managing more frequent change through social tools, and what happens when cloud comes into play in fully fledged operations. Andy was voted one of the top 25 most influential CTOs in the world in 2009 by InfoWorld and is grateful to readers of Computing Weekly who voted the Capgemini CTOblog the best Blog for Business Managers and CIOs each year for the last three years.
