For the latest annual Global Data Science Challenge, Capgemini colleagues around the world were tasked with helping the Lofoten-Vesterålen (LoVe) Ocean Observatory in Norway identify anomalous data recorded by its ocean-sensing equipment. Here, both winning teams explain how they answered this call to action.
Deep underwater, off the coast of Norway’s Lofoten archipelago, a scientific sensor array listens and records. For 24 hours a day, seven days a week, the Lofoten-Vesterålen (LoVe) Ocean Observatory produces a stream of chemical, physical, and biological readings.
Hidden within this data are the songs of humpback whales, the vibrations of vast shoals of migratory herring, and clues to a changing global climate. There’s just one difficulty: how to detect scientifically important events in this ocean of data.
The global callout
This was the challenge thrown down to Capgemini colleagues around the world through the latest Global Data Science Challenge (GDSC). In this annual company-wide internal competition, hundreds of employees compete to solve real-world challenges using artificial intelligence.
In 2020, entrants harnessed AI and machine learning to identify individual sperm whales, in order to monitor migration patterns and protect the whales’ natural habitats.
This year, 673 teams, made up of 1,200 Capgemini colleagues, entered the competition. Two emerged victorious to share the top prize: one based in India and the other in the UK.
An ocean of data
Anupam Saha, senior delivery manager at Capgemini India and leader of the Indian team, explains the challenge. “We were asked to build an AI solution that could analyze the reams of sensor data collected by the LoVe observatory and detect the anomalies that will direct further study,” he says.
David Gilhooley, principal consultant and engagement manager, Capgemini UK, who led the other winning team, adds that the sheer volume of data presented a technical challenge in itself: “This is the crux of the problem: the observatory is collecting masses of data showing the ocean being ‘normal’, whereas it’s the anomalies that are interesting.”
Anupam’s team addressed the problem by breaking down the data into pieces. “We tackled each data source individually, designing a model that would identify the outliers in each set. Therefore, much of our focus was on the data pre-processing, to determine the most relevant variables among the thousands we were presented with.”
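The article does not say which model the Indian team used, so the following is only a minimal sketch of the general idea of checking each data source on its own. It assumes a simple robust outlier rule (the modified z-score, based on the median and the median absolute deviation). The source names, readings, and the 3.5 threshold are hypothetical values chosen for illustration.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return the indices of readings whose modified z-score exceeds threshold."""
    med = median(values)
    # Median absolute deviation: a spread estimate that outliers barely affect.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread at all, so nothing can be ranked as unusual
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [
        i for i, v in enumerate(values)
        if abs(0.6745 * (v - med) / mad) > threshold
    ]


def outliers_per_source(sources, threshold=3.5):
    """Run the same outlier check independently on every data source."""
    return {name: mad_outliers(vals, threshold) for name, vals in sources.items()}


# Hypothetical readings from two sensors (not real LoVe data).
sources = {
    "temperature_c": [7.1, 7.0, 7.2, 7.1, 12.9, 7.0, 7.1],
    "salinity_psu": [34.9, 35.0, 35.1, 35.0, 34.9, 35.0, 35.1],
}
flagged = outliers_per_source(sources)
print(flagged)
```

Treating each source separately keeps one noisy sensor from masking an anomaly in another, at the cost of missing events that only show up when several sensors are viewed together.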
The UK team also prioritized careful data management. “We had to take these multiple data sources and organize them day by day,” explains David. “As well as padding out the missing areas, we had to normalize the data in order to deploy the machine-learning analysis correctly.”
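The three steps David describes (organizing by day, padding gaps, and normalizing) can be sketched as follows. This is an illustration under stated assumptions, not the UK team's actual pipeline: it assumes daily averaging, forward-filling of missing days, and z-score normalization, and the function names and sample readings are hypothetical.

```python
from collections import defaultdict
from datetime import date, datetime, timedelta
from statistics import mean, pstdev


def daily_means(readings):
    """Group (ISO timestamp, value) pairs by calendar day and average them."""
    by_day = defaultdict(list)
    for stamp, value in readings:
        by_day[datetime.fromisoformat(stamp).date()].append(value)
    return {day: mean(vals) for day, vals in by_day.items()}


def pad_missing_days(daily):
    """Forward-fill days with no readings so the series has no gaps."""
    day, last = min(daily), max(daily)
    filled, previous = {}, None
    while day <= last:
        previous = daily.get(day, previous)  # reuse the last known value
        filled[day] = previous
        day += timedelta(days=1)
    return filled


def normalize(daily):
    """Rescale values to zero mean and unit variance (z-score)."""
    values = list(daily.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return {d: 0.0 for d in daily}
    return {d: (v - mu) / sigma for d, v in daily.items()}


# Hypothetical sensor readings with no data on 2 June.
readings = [
    ("2021-06-01T00:00:00", 7.0),
    ("2021-06-01T12:00:00", 9.0),
    ("2021-06-03T06:00:00", 11.0),
]
padded = pad_missing_days(daily_means(readings))
normalized = normalize(padded)
print(padded[date(2021, 6, 2)], normalized)
```

Normalizing after padding puts every variable on the same scale, which matters when sources measured in different units are fed to the same model.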
Learning new skills
David’s team had dabbled in data processing and machine learning in small-scale academic settings, so they viewed the competition as an opportunity to sharpen their skills in a real-world scenario.
“This is the type of problem you’d get in an industrial setting – searching for outliers in a huge dataset with lots of variables,” he says. “From that perspective, the challenge was really practical. We were excited to learn the AWS [Amazon Web Services] machine-learning tools, and getting experience with this technology was extremely useful.”
Making the team work
David’s team included Vincent Malmedy, Gabriela Pomery, and Andrew Pennington, who are based at the Capgemini office in Bristol. “The competition was the perfect way to knit the team back together after all the pandemic disruption,” says David.
Anupam’s proudest moment was his team’s presentation to the judges. “Before this, we were ranked in fifth place,” he explains. “But we had realized the person using our model at the observatory might not have a background in data. Therefore, we avoided being too theoretical with our presentation, and I think this helped us claim joint first place overall.”
While these technologies shouldn’t be viewed as a ‘silver bullet’, David believes that AI and machine learning are well suited to helping us understand climate change and global warming. “It’s hard for human beings to understand all the incremental steps involved in these huge processes. Given the right instructions, however, machine-learning tools can help navigate these complexities.”
He believes that the competition, alongside Capgemini’s broader climate commitments, has made his team more aware of the choices they make – for example, in terms of how they commute to work and how much plastic they use.
Gearing up for the next challenge
Both teams received a technology prize for their winning entries. This replaced a planned trip to Norway to visit the observatory, which unfortunately had to be canceled because of the pandemic. However, Capgemini’s team in Germany is continuing its work with the LoVe observatory to integrate the winning solutions into its platform, enabling a large community of researchers to benefit from a broader understanding of oceanic ecosystems.
Next year, the GDSC will focus on finding a cure for river blindness in what will be another opportunity to shape a better future using AI and machine learning.