The phrase “One version of the truth” is used across all sectors as a succinct way of saying that data should be consistent, with no ambiguity about which value to use. The idea is to eliminate the alternative values that arise through inefficiency in the systems and processes used to collect and manage data. Like many expressions, its purpose has been blurred over the years, and it is not unusual for business representatives, data architects and data managers to raise it instinctively if anyone so much as mentions the words “data principle”. Over the years I have gone from liking the expression, to resisting it, and am now inclined to question it. What makes me question it?
When you break the phrase down into its parts, the first challenge is “one version”. The word “version” implies that the value may change over time, yet the phrase allows only one. So is all the rich history of what we thought was true at the time not permitted?
“Version of the truth” is also somewhat challenging. The truth is the truth; it doesn’t have versions, it just is! The truth may change over time, but if it is correctly recorded then at any instant in time there is only one truth.
Why does any of this matter? First, organisations need to remember that their databases don’t store “the truth” but someone’s or something’s perception of the truth. Once you accept this, it is worth asking why you would want only “one version of the perception of the truth”. For example, in a large retailer, daily sales can be derived from what passes through the Point of Sale systems in stores. However, they can also be derived from the cash and card transactions, i.e. takings. When the raw data sets are consolidated through different systems (Supply Chain and Finance) and then arrive at the boardroom as different figures, which one do you accept? When making decisions, should you use only one of them, and if so, which one? Or might it be better to maintain both, try to determine why they differ, and then eliminate the sources of error?
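To make the "maintain both" option concrete, here is a minimal sketch in Python of how the two daily figures might be kept side by side and compared per store. Everything in it is an assumption made for illustration: the function name `reconcile`, the store identifiers, the made-up amounts and the tolerance are hypothetical, and a real retailer's Supply Chain and Finance feeds would be far messier.

```python
from decimal import Decimal


def reconcile(pos_sales, takings, tolerance=Decimal("0.00")):
    """Compare two per-store daily totals, keeping both figures in the result."""
    report = []
    # Cover every store that appears in either source.
    for store in sorted(set(pos_sales) | set(takings)):
        pos = pos_sales.get(store)    # None if the store is absent from the POS feed
        cash = takings.get(store)     # None if the store is absent from the takings feed
        if pos is None or cash is None:
            status, difference = "missing", None
        else:
            difference = pos - cash
            status = "match" if abs(difference) <= tolerance else "investigate"
        report.append({
            "store": store,
            "pos": pos,
            "takings": cash,
            "difference": difference,
            "status": status,
        })
    return report


# Hypothetical figures, for illustration only.
pos_sales = {
    "S001": Decimal("1250.00"),
    "S002": Decimal("980.50"),
    "S003": Decimal("410.00"),
}
takings = {
    "S001": Decimal("1250.00"),
    "S002": Decimal("975.50"),
}

report = reconcile(pos_sales, takings)
for row in report:
    print(row["store"], row["status"], row["difference"])
```

Neither figure is discarded or declared the winner here. The report simply flags where the two perceptions disagree, or where one is missing, so that someone can go and find out why.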
No one would deny that having one version makes it easier to make decisions, but making good decisions requires accurate data. One version that is wrong is worse than two versions (right or wrong!), since it will be used as the truth despite being wrong. With two versions you have to investigate further, and while that might be tedious and expensive, it at least adds value compared to blindly accepting one version of the truth!
So if there is to be an overriding principle for data, perhaps it might be “when there is one correct answer we should know what it is”. It is not quite as catchy (I am a data person, not a marketing one!) but perhaps rather more useful?