Have you ever wondered how accurate and precise machine learning predictions can be these days? There are many competitions to test your skills, but the Data Science Game is unique among them: it is the biggest student competition worldwide, and one could say it is the student world championship! The finalist teams meet at the exclusive country chateau of the Capgemini University in Les Fontaines, close to Paris, where they compete in a hackathon for first place, honour and prizes such as iPads and drones.

Do you think you could put together a team that has what it takes?

What is the Data Science Game 2017?

The Data Science Game is a machine learning competition for students (in teams of four) with the goal of achieving the best prediction by means of computer algorithms. The real-life challenges come from various fields such as image recognition or predicting customer behaviour in insurance. Last year over 143 teams from 50 universities in 28 countries entered, with the best 20 teams progressing to the face-to-face finals in Paris.

The grand final

Last year’s final started on a sunny Friday afternoon in Paris, where the students networked and socialised with each other and with the sponsors. The afternoon culminated in the challenge problem being set, though no data was provided yet. Afterwards a dinner was held in the picturesque Capgemini chateau at Les Fontaines, followed by a visit to the hotel bar.

On Saturday morning, the challenge was laid out in detail: a real case with data from an insurer containing requests for automobile insurance quotes. The task was to predict which of the requests would lead to the purchase of a policy. The competitors had from Saturday morning to Sunday afternoon to find the best solution. A range of mentors, including myself, supported the teams throughout the challenge with helpful hints and tricks.

Only a few hours after the challenge’s inception the first predictions were submitted. Just as on Kaggle, one of the most popular platforms for machine learning competitions, the results were posted on a leaderboard and updated throughout the challenge.

Towards the evening of the first day the prediction methods became increasingly sophisticated, especially the feature engineering, where the teams used their understanding of the problem to create new features from existing variables. This included some complex relationship graph metrics generated from the hundreds of thousands of data points. The four-student teams progressed quickly and still had enough time to take advantage of the facilities while developing their solutions, with some opting to sleep in shifts through the night to maximise their development time.
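To give a flavour of what such feature engineering can look like, here is a minimal sketch in Python. It is not what any team actually built: the field names (customer_id, vehicle_id, quoted_price) and the three toy records are assumptions made up for illustration, and the "graph metrics" are reduced to simple node degrees in a customer-vehicle graph.

```python
from collections import defaultdict

# Hypothetical quote requests; the field names are illustrative assumptions,
# not the actual competition schema.
requests = [
    {"customer_id": "c1", "vehicle_id": "v1", "quoted_price": 400.0},
    {"customer_id": "c1", "vehicle_id": "v2", "quoted_price": 600.0},
    {"customer_id": "c2", "vehicle_id": "v1", "quoted_price": 500.0},
]


def add_features(rows):
    """Return copies of the rows with three derived features added."""
    vehicles_per_customer = defaultdict(set)
    customers_per_vehicle = defaultdict(set)
    prices_per_customer = defaultdict(list)

    # First pass: collect the relationships between customers and vehicles.
    for row in rows:
        vehicles_per_customer[row["customer_id"]].add(row["vehicle_id"])
        customers_per_vehicle[row["vehicle_id"]].add(row["customer_id"])
        prices_per_customer[row["customer_id"]].append(row["quoted_price"])

    # Second pass: attach the derived features to every request.
    enriched = []
    for row in rows:
        prices = prices_per_customer[row["customer_id"]]
        mean_price = sum(prices) / len(prices)
        enriched.append({
            **row,
            # Number of distinct vehicles this customer asked about.
            "customer_degree": len(vehicles_per_customer[row["customer_id"]]),
            # Number of distinct customers who asked about this vehicle.
            "vehicle_degree": len(customers_per_vehicle[row["vehicle_id"]]),
            # Quote relative to the customer's own average quote.
            "price_vs_customer_mean": row["quoted_price"] / mean_price,
        })
    return enriched


features = add_features(requests)
```

Each enriched record keeps its original fields and gains the new ones, so it can be fed to whichever model a team prefers. Whether features like these help at all depends entirely on the data.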

On the finishing straight

The next morning was very exciting because the scores of the top teams were very close, so everyone started to act tactically. Every team wondered whether their own good results and those of the other teams were at risk of being overfitted, and how well their model would fare against the others. The tension peaked when the models were finally evaluated on a data set known only to the organisers. With a razor-thin advantage, a Russian team won first place thanks to the best prediction quality, closely followed by the Cambridge team in second place. The other teams also performed strongly and achieved remarkable results.
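Why the worry about overfitting? The following sketch, which is purely illustrative and not taken from the competition, shows the extreme case. The data is synthetic noise, the Memoriser class is a hypothetical and deliberately bad model, and accuracy is used as the metric only as an assumption, since the actual scoring metric is not stated here.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Synthetic data: a request id and a purchase label that is pure noise,
# so no model can genuinely beat chance on requests it has not seen.
data = [(request_id, rng.random() < 0.5) for request_id in range(300)]
train, holdout = data[:200], data[200:]


class Memoriser:
    """Deliberately overfitted model: it memorises the training labels."""

    def fit(self, rows):
        self.table = dict(rows)
        positives = sum(label for _, label in rows)
        # Fall back to the majority training label for unseen requests.
        self.default = positives * 2 >= len(rows)
        return self

    def predict(self, request_id):
        return self.table.get(request_id, self.default)


def accuracy(model, rows):
    """Share of rows for which the prediction matches the label."""
    hits = sum(model.predict(x) == y for x, y in rows)
    return hits / len(rows)


model = Memoriser().fit(train)
train_accuracy = accuracy(model, train)      # perfect by construction
holdout_accuracy = accuracy(model, holdout)  # unseen ids: majority guess only
print(f"train: {train_accuracy:.2f}, holdout: {holdout_accuracy:.2f}")
```

The model looks flawless on the data it has seen and falls back to guessing on the rest. Real models overfit far more subtly, which is why a final evaluation on data known only to the organisers is the fairest way to rank the teams.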

A great weekend

The weekend was capped with the award ceremony, and we had a relaxed evening and a comfortable night at Les Fontaines. Besides networking, many attendees took the opportunity to relax in the surrounding park, in the swimming pool and with other sports activities. Everyone involved (competitors, mentors and sponsors) agreed that we had all learned a lot and had a great deal of fun.

As I reflect on the great memories of the Data Science Game 2016, I look forward to taking part again in this year’s challenge and supporting the competitors as one of the Capgemini mentors, helping them improve their prediction quality even further.

The registration deadline for the qualification round is the 9th of April, so there is not much time left to apply and take part.

All information about the Data Science Game including the detailed conditions of participation can be found here: www.datasciencegame.com