Why you need a trusted data pipeline

Dan Sherman
May 29, 2020

The definition of business value differs from sector to sector and, often, from company to company within a given sector. The one constant in delivering value, however, is the ability to make the right decision at the right time, ahead of your competition. An inability to use your data assets as a competitive differentiator puts you at a clear disadvantage in the market. A timely example is manufacturers' push to find new ways to market and sell directly to their end consumers. To disintermediate, a company must accelerate its decision making in product development, marketing, and, most importantly, its supply chain. To accomplish this transformation, these producers must evolve into data-driven organizations that make decisions based on unbiased outputs.

For years, companies have strived to improve this decision-making process by maximizing the use of their data assets. The number and scale of potential data sources has never been greater than it is today, and the list is only growing. While many of these same companies have done their best to corral this data into a consumable form, the challenge of maintaining trust in the outputs continues to escalate. Much of this loss of trust stems from the persistent push to centralize all data into a data lake or data warehouse so that consumers of the insights have one place to go for their data needs. Instead, companies need to think in terms of a data pipeline rather than a central data repository.

A strong, trustworthy data pipeline that accelerates decision making is vital to developing a data-driven culture. While there is still tangible value in using a data lake for specific types or uses of data, to suggest that all data should be centrally housed is not reasonable, particularly when speed, accuracy, and trust are the priorities. As soon as you move data from one location to another, you increase both time to delivery and the risk of error. This is not simply a matter of human error in transforming the data; it is also an opportunity to introduce bias. When you move data from its source rather than consuming it at the source, someone must decide what data is relevant to the business and its decision-making process.

One result of too much data movement is that a significant insight may differ from one department to the next, creating multiple versions of the truth. These multiple versions quickly erode business value as people move away from making decisions and instead work to defend the basis for a decision. Additionally, when data movement becomes the norm, the cost of your data ecosystem climbs steeply and data security is diminished. For most companies, creating an effective data pipeline requires a change of culture and process, as well as new skill sets and technology, to allow for consistency, speed, and one version of the truth.

This concept of limiting data movement to accelerate change is a core differentiator of Capgemini's Renewable Insights offering. In the world of data and analytics, the ability to innovate and transform through disruption is the definition of being Renewable.