Stuart McDonald works for Entity Group as a Probabilistic Matching expert, focussed on using the Probabilistic Matching Engine (PME) in the IBM Master Data Management (MDM) software suite.

A very simple, practical example of a matching problem: are these two addresses the same?

· Entity House, 980 Cornforth Drive, Sittingbourne, United Kingdom. ME9 8PX

· 980 Cornforth Dr, S’bourne, UK

The human eye (with a bit of help from Google Maps) will probably judge them to be the same, but how can you reliably codify this judgement so that an algorithm reaches the same conclusion? And how can you ensure the algorithm also works with any other UK or worldwide address?
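To make the question concrete, here is a minimal Python sketch of one naive way to codify that judgement: normalise both addresses, expand known abbreviations, then measure how many tokens of the shorter address have a close match in the longer one. It is not how PME works internally; the abbreviation table, the 0.85 similarity threshold and the function names are all assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher

# Hypothetical abbreviation table, for illustration only. A real system
# would need a far larger, curated and locale-aware set of rules.
ABBREVIATIONS = {
    "dr": "drive",
    "uk": "united kingdom",
    "s'bourne": "sittingbourne",
}


def normalise(address: str) -> list:
    """Lower-case, split into tokens and expand known abbreviations."""
    text = address.lower().replace("’", "'")  # treat curly and straight apostrophes alike
    tokens = re.findall(r"[a-z0-9']+", text)
    expanded = []
    for token in tokens:
        expanded.extend(ABBREVIATIONS.get(token, token).split())
    return expanded


def token_similarity(a: str, b: str) -> float:
    """Character-level similarity between two tokens, from 0.0 to 1.0."""
    return SequenceMatcher(None, a, b).ratio()


def match_score(first: str, second: str, threshold: float = 0.85) -> float:
    """Fraction of tokens in the shorter address that closely match a token in the longer one."""
    a, b = normalise(first), normalise(second)
    if len(a) > len(b):
        a, b = b, a
    if not a:
        return 0.0
    hits = sum(
        1 for token in a
        if max(token_similarity(token, other) for other in b) >= threshold
    )
    return hits / len(a)


full = "Entity House, 980 Cornforth Drive, Sittingbourne, United Kingdom.  ME9 8PX"
short = "980 Cornforth Dr, S’bourne, UK"
print(match_score(full, short))
```

Even this toy version shows where the difficulty lies: the result depends entirely on the abbreviation table and the threshold, and every token counts equally, so a matching house number carries no more weight than a matching country. Hand-written rules like these do not scale to arbitrary UK or worldwide addresses.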

Stuart’s blog discusses the dependency Data Scientists have on data preparation and conformity and on Master Data Management, including advanced techniques such as entity resolution using Probabilistic Matching. As Data Scientists we tend to refer to these activities as ‘Data Munging’, and we all know how challenging and time-consuming it can be! Stuart’s blog starts to lift the lid on elements of this.

Entity Group is classed as a Small/Medium Enterprise (SME) and frequently partners with Capgemini, since it provides niche expertise which fulfils a core component of any Data Science team.