Machine learning for graphs

Capgemini

12 Jul 2021

Machine Learning (ML) practitioners are learning how to successfully apply the most recent advances in ML to Graph-based datasets over a wide range of use-cases. This blog aims to introduce the basics of Graph ML together with the most powerful and successful recent applications.

Introduction

It has long been appreciated that many data structures are best represented by a graph, where a graph in this context is defined as a set of discrete nodes connected to each other via edges.
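As a minimal sketch of this definition, the snippet below stores a small undirected graph as a list of nodes and a list of edges, then derives each node's neighbours. The node labels and the helper name `build_adjacency` are illustrative assumptions, not part of any particular graph library.

```python
# A small undirected graph: four nodes and three edges (illustrative data).
nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("A", "C"), ("C", "D")]


def build_adjacency(nodes, edges):
    """Map each node to the sorted list of nodes it shares an edge with."""
    adjacency = {node: set() for node in nodes}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)  # undirected: record the edge in both directions
    return {node: sorted(neighbours) for node, neighbours in adjacency.items()}


adjacency = build_adjacency(nodes, edges)
print(adjacency)
```

The neighbour lookup built here is the structure that the graph convolutions described later operate over.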

Examples of graphs, their applications, and graph datastores can be found in previous blogs from the Capgemini Graph Guild, e.g. Introducing Capgemini’s UK Graph Guild.

The last decade has seen an explosion in the capabilities and application of Deep Machine Learning (ML) across an ever-increasing range of use-cases, notably advances in Computer Vision (e.g. Google Reverse Image Search), Speech Recognition & Translation (e.g. Amazon’s Alexa, Apple’s Siri, Google Translate and many others), and complex games such as Go (e.g. Google DeepMind’s AlphaGo, AlphaZero, MuZero, etc.).

These advances have largely been achieved using multi-layered neural networks (NNs), the most widely used method of Deep ML, spurred on by improved understanding of optimal architectures and availability of cheap yet powerful compute.

More recently, researchers have begun to explore how NNs can be applied to Graph-based datasets using so-called Graph Neural Networks (GNNs), with success across a range of extremely interesting use-cases.

Graph Neural Networks

A basic NN is designed to act on input data in the form of 1D vectors of numbers, so how can we apply one to a graph, which is manifestly not a 1D vector? We can take inspiration from Computer Vision (CV), where, in the simplest case, the input data is a 2D grid of numbers. A typical CV deep neural network (DNN) consists in part of layers of convolution filters. During training, these filters learn to generalize over the pixel arrays in the dataset by convolving over neighbouring pixels, so that the final layers of the NN can learn to detect features such as a cat’s whiskers and so correctly classify an image as that of a cat. The basic idea is shown in Figure 1.

**Figure 1: Convolutional filter in a computer vision neural network**
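To make the convolution step concrete, here is a minimal sketch in plain Python that slides a 3x3 filter over a small 2D grid. The function name `convolve2d`, the toy image, and the hand-picked edge-detecting filter are assumptions made for illustration; in a trained network the filter weights are learned rather than chosen. As in most deep learning libraries, the filter is applied without flipping (strictly speaking, a cross-correlation).

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the filtered grid."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # Weighted sum of the pixel at (r, c) and its neighbours under the filter.
            total = 0.0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Toy 4x5 "image": dark on the left, bright on the right.
image = [
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]

# Hand-picked filter that responds to dark-to-bright vertical edges.
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = convolve2d(image, kernel)
print(feature_map)  # [[0.0, 3.0, 3.0], [0.0, 3.0, 3.0]]
```

The output is zero where the filter sees only uniform pixels and non-zero where it overlaps the edge, which is the sense in which a filter "detects" a local pattern.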

It turns out we can apply a similar approach to graph datasets: for each Node in our graph, we can convolve feature data over its neighbours, so that the GNN can generalize over the properties of every Node’s neighbours. By adding further layers to our GNN, we can generalize over each Node’s neighbours’ neighbours, and so on, increasing the GNN’s ability to generalize over the structure between Nodes. This process is shown in Figure 2.

**Figure 2: Convolutional filter in a graph neural network**
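The graph analogue can be sketched in the same spirit. The example below assumes one simple form of graph convolution: each Node averages its own features with those of its neighbours, then applies a linear map followed by a ReLU. The function name `graph_conv_layer`, the three-node path graph, and the hand-set identity weight are illustrative assumptions; real GNN layers learn their weights and often normalize the aggregation differently.

```python
def graph_conv_layer(features, adjacency, weight):
    """Average each node's features with its neighbours', then apply a linear map and ReLU.

    `weight` is a list of rows (out_dim x in_dim); in a real GNN it would be learned.
    """
    new_features = {}
    for node, neighbours in adjacency.items():
        group = [node] + list(neighbours)  # include the node itself
        dim = len(features[node])
        mean = [sum(features[n][d] for n in group) / len(group) for d in range(dim)]
        transformed = [sum(w * x for w, x in zip(row, mean)) for row in weight]
        new_features[node] = [max(0.0, value) for value in transformed]  # ReLU
    return new_features


# Path graph A - B - C; only A starts with a non-zero feature.
adjacency = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
features = {"A": [1.0], "B": [0.0], "C": [0.0]}
weight = [[1.0]]  # identity weight, chosen by hand for illustration

one_layer = graph_conv_layer(features, adjacency, weight)
two_layers = graph_conv_layer(one_layer, adjacency, weight)

print(one_layer["C"])   # C has only seen its direct neighbour B so far
print(two_layers["C"])  # after two layers, A's feature has reached C
```

After one layer, Node C still carries no trace of A's feature because A is two hops away. After the second layer it does, which illustrates how stacking layers extends each Node's view to its neighbours' neighbours.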

Even a basic GNN such as the one described above can achieve useful performance in node-feature, edge-feature, and graph-feature classification and regression.

But we can go further by adding to our GNN a feature from many of the most advanced models in Natural Language Processing (NLP), called Attention. With Attention, our GNN can learn to weight the features of each Node’s neighbours by different amounts rather than treating them all equally, greatly increasing its learning capacity and accuracy.
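The idea of unequal neighbour weights can be sketched as follows. This example assumes a deliberately simplified scoring rule: the score between a Node and each neighbour is the dot product of their feature vectors, and a softmax turns the scores into weights that sum to one. Attention-based GNNs typically learn the scoring function instead; the function names and toy features here are illustrative only.

```python
import math


def softmax(scores):
    """Convert raw scores into positive weights that sum to 1."""
    largest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_aggregate(node, features, adjacency):
    """Weight each neighbour by its similarity to `node`, then sum the weighted features."""
    neighbours = adjacency[node]
    query = features[node]
    # Simplified score: dot product between the node and each neighbour.
    scores = [sum(q * k for q, k in zip(query, features[n])) for n in neighbours]
    attention = softmax(scores)
    dim = len(query)
    aggregated = [
        sum(a * features[n][d] for a, n in zip(attention, neighbours))
        for d in range(dim)
    ]
    return dict(zip(neighbours, attention)), aggregated


# Node A has two neighbours: B resembles A, C does not.
adjacency = {"A": ["B", "C"]}
features = {"A": [1.0, 0.0], "B": [1.0, 0.0], "C": [0.0, 1.0]}

weights, aggregated = attention_aggregate("A", features, adjacency)
print(weights)     # B receives a larger weight than C
print(aggregated)  # so the aggregate is dominated by B's features
```

Compared with a plain average, the aggregate here leans towards the neighbour judged more relevant, which is the extra capacity that Attention provides.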

Example Practical Use Cases

Compared to CV and NLP, the use of Graph Neural Networks is still in its infancy; nevertheless, they have been successful in some powerful use-cases, such as:

Antibiotic discovery: A Deep Learning Approach to Antibiotic Discovery
Researchers were able to identify a structurally novel antibiotic, halicin, by representing candidate drug molecules as graphs and training a GNN on existing examples of molecules with known antibacterial activity to predict how effective each new candidate could be.
Enhanced traffic level/travel time prediction: Traffic prediction with advanced Graph Neural Networks
Google/DeepMind exploited GNNs to improve their traffic level/ETA estimation algorithm within Google Maps by representing road maps and historical traffic levels on graphs and using a GNN to make better estimates of future traffic levels and so improve ETA calculation.
Financial transaction monitoring/fraud detection
Graphs are already widely used to gain insight into the nature of financial transactions between parties and to prevent crime. GNNs enable even greater capabilities in crucial areas such as detection of fraudulent or money-laundering transactions, community discovery, and anomaly detection, even in the absence of large, well-curated training datasets.

The Future

Graph-based machine learning is an extremely active area of academic research that is very much in its infancy. The range of applicable use-cases is wide: beyond those described above, it includes Knowledge Graph construction, superior Recommender Systems, and Supply Chain optimization, to name a few.

The Capgemini Graph Guild is actively exploring and exploiting GNNs for the future benefit of our clients. If you would like to learn more or need help solving your own graph-based challenges, please get in touch with us at idgraphguild.uk@capgemini.com.