Fred Perry – the last British man to hold the Wimbledon trophy aloft, 73 years ago. With the nation crying out for its champion and Andy Murray professing to have already won this year’s tournament (albeit on his PlayStation 3), we thought that this week we would delve into the world of tennis and try our hand at predicting the outcome of 2009’s battle for glory on the hallowed turf of SW19. Who will win?
In some circles it is expected that statisticians and analysts should know the answer to this question. After all, we should be able to use the wealth of data available on previous performances to predict, with 100% certainty, who will claim glory and who will crash out in the 1st round, right? Wrong! (We’d be summering with the likes of Roman Abramovich and Richard Branson if we could!) There is a plethora of factors that can affect the outcome of a match on the road to a Centre Court final and victory: training, advancements in equipment technology, injury and even the dreaded weather can take their toll! There is always an element of risk attached to relying on historical data to predict the outcome of future events.
Could a model have predicted that Gisela Dulko would knock the wounded superstar, Maria Sharapova, out of Wimbledon in the 2nd round? Answer – Possibly…depending on the data you fed it. Our own common sense tells us that in her prime Sharapova would most likely have seen Dulko off and advanced with the other seeded superstars to the next round. However, given the Russian’s shoulder injury, the chances of Dulko beating her increased.
So with this risk in mind, we looked for data that might help us find the next Wimbledon champion. Thanks to ATP Tour Match facts we were able to extract data on Service Games Won % per player.
Now I know what you are thinking – based on the data we have, the player with the highest service games won % should therefore win Wimbledon, right? Right…most of the time. Let me explain. Our model simulated the Wimbledon tournament 200 times and Andy Roddick, who has the highest service games won percentage at 91%, did indeed win most often. However, other players did have their own blessed days and the results were as follows:
So what does this mean? Our model gives Roddick a 43% chance of winning outright, whereas the professionals give him 4.8% – a lot less faith in him! Similarly, our model gives Murray a 1% chance of winning, whereas the pros think closer to 33%. In the real world, Federer is firm favourite to win the tournament, but our model ranks him 3rd, behind Roddick and Karlovic. In summary, if we have faith in the bookies, our model needs additional data before it can produce more realistic odds of winning. What do they know that we don’t?!
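For readers who want to see the mechanics, here is a minimal sketch of how a simulation like the one described above could be put together. It is not our actual model. It assumes that each player holds serve with a fixed probability equal to their service games won %, whoever the opponent is, that every set can end in a tie-break, and that the draw is random. Only Roddick's 91% comes from the data above; the other names and rates in `HOLD_RATES` are invented placeholders, and all the function names are hypothetical.

```python
import random
from collections import Counter

# Only Roddick's 0.91 comes from the article; every other entry is a
# made-up placeholder so that the sketch has a full 8-player field.
HOLD_RATES = {
    "Roddick": 0.91, "Player B": 0.89, "Player C": 0.88, "Player D": 0.85,
    "Player E": 0.83, "Player F": 0.80, "Player G": 0.78, "Player H": 0.75,
}

def play_set(hold_a, hold_b, rng):
    """Return True if player A wins the set (A serves first: a simplification)."""
    games_a = games_b = 0
    a_serving = True
    while True:
        if games_a == 6 and games_b == 6:
            # Assumption: the tie-break is a single weighted coin flip.
            return rng.random() < hold_a / (hold_a + hold_b)
        server_holds = rng.random() < (hold_a if a_serving else hold_b)
        if server_holds == a_serving:
            games_a += 1  # A held serve, or A broke B's serve
        else:
            games_b += 1  # B held serve, or B broke A's serve
        a_serving = not a_serving
        if max(games_a, games_b) >= 6 and abs(games_a - games_b) >= 2:
            return games_a > games_b

def play_match(hold_a, hold_b, rng, sets_to_win=3):
    """Return True if player A wins a best-of-five-sets match."""
    sets_a = sets_b = 0
    while sets_a < sets_to_win and sets_b < sets_to_win:
        if play_set(hold_a, hold_b, rng):
            sets_a += 1
        else:
            sets_b += 1
    return sets_a == sets_to_win

def simulate_tournament(hold_rates, rng):
    """Play one knockout tournament; the field size must be a power of two."""
    field = list(hold_rates)
    rng.shuffle(field)  # random draw, no seeding
    while len(field) > 1:
        field = [
            a if play_match(hold_rates[a], hold_rates[b], rng) else b
            for a, b in zip(field[::2], field[1::2])
        ]
    return field[0]

def estimate_win_shares(hold_rates, runs=200, seed=2009):
    """Return the share of simulated tournaments won by each player."""
    rng = random.Random(seed)
    wins = Counter(simulate_tournament(hold_rates, rng) for _ in range(runs))
    return {player: wins[player] / runs for player in hold_rates}
```

Calling `estimate_win_shares(HOLD_RATES)` returns each player's share of the 200 simulated titles. With only 200 runs the shares move noticeably from one seed to the next, which is worth remembering before reading too much into any single percentage.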
There is also the risk of the unexpected. Even though our model says that Roddick is much more likely to win Wimbledon than Juan Martin del Potro, if they were to meet in round 4, Roddick could be tired from winning tougher matches in the draw. He could have picked up an injury or even just have a bad day, resulting in a loss. Likewise, in round 1, Pablo Cuevas (ranked 125th in the world) beat Christophe Rochus (60th in the world) on a tie-break in the last set, even though our model gave Rochus a 97% likelihood of winning.
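A quick back-of-the-envelope calculation shows why a result like Cuevas beating Rochus should not come as a shock. The sketch below assumes, purely for illustration, that every favourite in the 64 first-round matches of a 128-player draw has the same 97% chance of winning and that the matches are independent. Neither assumption comes from our model, and the function name is hypothetical.

```python
def chance_of_at_least_one_upset(favourite_win_prob, matches):
    """Probability that at least one favourite loses, assuming independent matches."""
    return 1.0 - favourite_win_prob ** matches

# Hypothetical scenario: 64 first-round matches, every favourite at 97%.
first_round_upset_chance = chance_of_at_least_one_upset(0.97, 64)
```

Under those assumptions the chance of at least one first-round upset comes out at roughly 86%, so even a very confident model should expect to be wrong somewhere in the draw.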
So what does this all tell us? Past performance is not a guarantee of future returns. Good decisions are born from assessing and managing the risk involved. If tournaments truly are won by having a high service games won % then there is a good chance Roddick will win. However, my money is staying firmly in my wallet because we all know that tennis is so much more than that.