With contributions from Luc Ducrocq
Organizations spend a great deal of time, money, and human capital developing big data programs, implementing data management solutions and ironing out the integration processes needed to tie everything together. Spending in this area has grown and will continue to grow exponentially as organizations tackle the Big Data challenge of weaving together structured, semi-structured, and unstructured information. However, given the new data structures they must deal with and an overall lack of quality information, both the implementation of data management solutions and trust in the resulting insights are on shaky ground.
Here is a short list of large concerns to be aware of when implementing a big data program:
Not including data governance at the very beginning of the program. Data governance is the process of creating and agreeing to standards and requirements for the collection, identification, storage, and use of data. It should not be viewed as optional in any data-driven project. Data governance must cover structured, semi-structured, and unstructured data, as well as registries, taxonomies, and ontologies, because it contributes heavily to organizational success through repeatable and compliant practices. Considering all types of data is especially critical today, when data from sources such as Twitter and Facebook will most likely arrive in every kind of unstructured and semi-structured format. Governance guidelines need to address the requirements of all these new data types as part of any new program. Governance must therefore be addressed at the outset of any large-scale technology project, so that the resulting insights can be trusted and the organization can realize value from its investment.
Ignoring data quality, data relevancy, and people skills. Running your big data program without a data governance program running concurrently is a recipe for disaster. A data governance program is required to help you understand which data is relevant and which should be removed. It is also imperative to understand the level of accuracy, consistency, and applicability of the data. A governance program will also help you identify the data literacy skills you need for manipulating, managing, and interpreting data that includes not just numbers but also text and images. These skills include data architecture and knowledge of new data types and how to use, manage, and analyze them. The analysis skills required are also different when you consider the various ways new data enters an organization today.
Being unprepared for change. The structural and process changes that will likely come with a Big Data effort need to be managed very carefully. They include new ways to capture and store Big Data, and new ways to analyze and build analytics against these new data sources alongside the data you’ll receive through the program. Strategically, the added efficiency should do more than speed up existing processes; it should go hand in hand with data governance. Today, companies usually manage data as a corporate asset through a dedicated data management organization. If your company doesn’t have an effective data management organization in place, adopting Big Data technology will be a huge challenge.
Excluding consultants. When looking at a consultant’s role in the Big Data arena, one should look no further than the business intelligence resources most large systems integrators employ. After all, Big Data issues are not new; they are simply more complex than they were in decades past. Customers have always had to address large and expanding data sets and find ways to gain better insights from their data (turning data into information and then turning that into actionable intelligence). What has changed is the type of data being harnessed to reach that end goal. With the sheer volume of unstructured and semi-structured data now available, it is the type of data, not the core skills or techniques, that has changed in recent years. To address “Big Data,” consulting professionals still need to do what we have always done best: identify, acquire, organize, cleanse, store, and analyze.
An effective consultant’s approach takes a holistic view of master data management, addressing a company’s challenges with a sound enterprise master data management strategy and roadmap. The approach funnels everything into a flow of data that is easy to govern and to check against key performance indicators, enabling collaboration inside and outside your firewall. An effective master data management and governance methodology will make sure everyone knows “who” governs “what,” what is “what,” and “how” the data is used across the value chain.
Your data governance may not be that strong to start with. Big Data will make it worse. Big Data governance requires governing many different types of data, not just what’s in relational databases. The scope needs to include non-relational databases and unstructured data and documents, which will in turn require new tools to handle these other technologies. Assessing, profiling, and managing a larger volume of unstructured data without strong data management in place becomes infeasible.
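As a rough illustration of what "assessing and profiling" can mean in practice, the sketch below computes two simple quality measures for a batch of records: how often each required field is actually filled in, and which identifiers appear more than once. The record layout, the field names (`id`, `author`, `text`), and the `profile` function are all hypothetical, chosen only for this example; a real governance program would define its own rules and would likely use dedicated profiling tools.

```python
from dataclasses import dataclass


@dataclass
class QualityReport:
    total: int
    completeness: dict   # field name -> share of records with a usable value
    duplicate_ids: list  # identifiers that occur more than once


def profile(records, required_fields, id_field="id"):
    """Summarize completeness and duplicate identifiers for a list of dicts.

    Assumption for this sketch: each record is a plain dict, and a value
    counts as missing when it is None or a whitespace-only string.
    """
    total = len(records)
    completeness = {}
    for field in required_fields:
        filled = 0
        for record in records:
            value = record.get(field)
            if value is not None and str(value).strip() != "":
                filled += 1
        completeness[field] = filled / total if total else 0.0

    seen = set()
    duplicate_ids = []
    for record in records:
        record_id = record.get(id_field)
        if record_id in seen and record_id not in duplicate_ids:
            duplicate_ids.append(record_id)
        seen.add(record_id)

    return QualityReport(total, completeness, duplicate_ids)


# Hypothetical sample of captured posts.
sample = [
    {"id": "p1", "author": "ana", "text": "Great service"},
    {"id": "p2", "author": "", "text": "Late delivery"},
    {"id": "p1", "author": "ana", "text": "Great service"},
    {"id": "p3", "author": "raj", "text": "   "},
]

report = profile(sample, ["author", "text"])
print(report.completeness)   # {'author': 0.75, 'text': 0.75}
print(report.duplicate_ids)  # ['p1']
```

Even measures this simple give a governance team something concrete to set thresholds against, for example rejecting a feed whose completeness for a required field falls below an agreed level.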
When you are working with unstructured data, especially social media data, opinion in the C-suite varies widely on how useful and actionable this information is. Is this information something I can actually take advantage of and base decisions on, without delving any further than acquiring the unstructured data? Social media data is also most useful immediately, at the moment of the tweet, the blog post, or the Facebook update, when it is current and fresh to the topic at hand. And it can become so voluminous so quickly that standards and rules must be applied through a data governance program before you can use the information effectively. The governance program should address what to capture and how often, and how to handle the data from a reporting and actionable perspective once it is captured and stored.
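One way to picture such standards and rules is as an explicit capture policy that states which source to listen to, which topics matter, and how fresh a post must be to be worth keeping. The following is a minimal sketch under those assumptions; `CapturePolicy`, `should_capture`, the keyword list, and the six-hour window are invented for illustration and are not taken from any particular product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class CapturePolicy:
    source: str          # which feed the rule applies to
    keywords: frozenset  # topics the business has agreed are relevant
    max_age: timedelta   # how long a post stays fresh enough to capture


def should_capture(policy, source, text, posted_at, now):
    """Return True when a post matches the policy's source, freshness, and topic."""
    if source != policy.source:
        return False
    if now - posted_at > policy.max_age:
        return False  # stale: social media data loses value quickly
    lowered = text.lower()
    return any(keyword in lowered for keyword in policy.keywords)


# Hypothetical policy: keep tweets about outages or refunds from the last six hours.
policy = CapturePolicy(
    source="twitter",
    keywords=frozenset({"outage", "refund"}),
    max_age=timedelta(hours=6),
)

now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
fresh = should_capture(policy, "twitter", "Another OUTAGE today", now - timedelta(hours=1), now)
stale = should_capture(policy, "twitter", "Outage again", now - timedelta(days=2), now)
print(fresh, stale)  # True False
```

Writing the rule down as data rather than burying it in collection scripts makes it something the governance body can review, version, and change as executive opinion on the data's usefulness evolves.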
In conclusion, because opinions at the highest levels of management vary on the acquisition and use of this type of data, it is critical to gauge the importance and usefulness of the data through a data governance program. The data governance program must also address how to present this data to executives in a dashboard format on laptops, phones, iPads, and other mobile devices, so that the information reaches the “powers” throughout the organization quickly, efficiently, effectively, and, most important, in a timely manner. Governance should drive the ability to create and deliver this information to executives worldwide in a form, and on a timeline, that allows decisive action by the right people.