Insights & Data Blog

Opinions expressed on this blog reflect the writer’s views and not the position of the Capgemini Group

Agile in Data Science: ensuring Assurance Scoring answers the 'ask'

The move towards ‘digital by default’ in public services has led to an increase in the number of digital end-to-end solutions and, with these, opportunities to integrate insight and efficiencies from data science into the processes. For instance, the automated risk assessment and decisioning systems that sit behind online credit card applications have obvious equivalents in public sector application processes - and these stand to benefit considerably from the Assurance Scoring methods that my colleagues have introduced in previous blogs in this series.

In this blog I’d like to focus on the absolute need to tailor each application of these techniques to the individual use case, and how we begin to do this in an Agile way – because the mere application of these methods is not a panacea. The first key to delivering a solution that not only adds immediate business value but also achieves its maximum potential lies in appreciating the exact requirements of the range of stakeholders, and in a clear understanding of the existing business process and the target operating model.


Integrating with Agile processes

So how do we extract these requirements and what do they look like? As many of our clients practise an Agile methodology for these end-to-end solutions, we most often align ourselves to the same approach, including ceremonies such as daily stand-ups, so that we can best integrate with and underpin their delivery. With Agile comes the familiar ‘As a… I want… So that…’ User Story template for requirements, but to adapt this to capture the needs of an analytical solution and all the assurances, policy implications and as-yet-unknowns that come with it, we add some twists.

The ‘As a…’ still refers to users, although these span the whole stakeholder range and commonly include multiple consumers of a service: decision-makers, perhaps a triage team, their seniors for MI or team-structuring purposes, plus the data scientists who may need to maintain the solution going forward. Compliance stakeholders should be included if variables used in Assurance Scoring could be contentious, and security stakeholders should be involved if bulk data is to be made available externally in outputs, or if features such as role-based access control (RBAC) are required.

The ‘I want…’ is what the users need to exploit or need at their disposal. It might cover how they need to see the output data, whether through a technique (graph analytics, spatial analysis), an output type (table, dashboard, map, graph) or a specification of the level of detail they need returned. Security and compliance stakeholders need assurances around testing the solution, and senior stakeholders may specify a level of rigour or validation. Typically the ‘I want’ varies between discovery and Beta phases, potentially being a set of ideas to try during a discovery, through to the exact specifics the stakeholders need for Beta or beyond.

The ‘So that…’ may be a functional reason, benefit or rationale. For instance, a particular output type may have been described so that it can be integrated into existing working practices or systems, or so that interpretability is improved, reducing decision-making time and improving customer service levels. Certain visualisation methods, such as map outputs, can help prioritise resources or reveal spatial patterns in ways that weren’t previously possible. Security and compliance User Stories are likely to relate to assurances that these teams can then pass on to stakeholders, partners or data sharers, contributing to the reputation and potentially the longevity of the solution.
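To make the adapted template a little more concrete, here is a minimal sketch in Python of how such a story might be captured alongside the backlog. The field names and the example story are my own illustration, not a prescribed schema from the Assurance Scoring framework.

```python
from dataclasses import dataclass

# Illustrative only: one way to record the adapted
# 'As a... I want... So that...' template as structured data.
@dataclass
class UserStory:
    as_a: str     # stakeholder: decision-maker, triage team, data scientist, ...
    i_want: str   # the output, technique or assurance they need
    so_that: str  # the functional reason, benefit or rationale

# Hypothetical example story for a triage team consumer.
story = UserStory(
    as_a="triage team decision-maker",
    i_want="a map output showing applications scored by risk band",
    so_that="I can prioritise resources and spot spatial patterns quickly",
)

print(f"As a {story.as_a}, I want {story.i_want}, so that {story.so_that}.")
```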


Iterative requirements and development

Well-defined User Stories capture the common understanding between developers and Product Owners – and rather than being fixed, this is a series of working assumptions that are constantly under review. The coaxing out of needs and specifics naturally extends through the Inception and Elaboration phases, with the effect of continuously derisking the project. In Assurance Scoring, particular emphasis is placed during these phases on defining the target variable: what it is we’re segregating, how this segregation is captured in the business data, and what risks or opportunities, if any, there are around feedback loops. Then, in Construction, as the Data Scientist works to rapidly prototype outputs aligned to the current User Stories, new opportunities and ideas almost inevitably arise that can affect the trajectory of development. At each of these points the User Stories should be amended to reflect the chosen direction, and the relevant decision added to the RAID log.
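As an illustration of what pinning down the target variable can involve, here is a hedged Python sketch. The DataFrame, column names and outcomes are invented for the example; the point is simply that the definition of the target, and an early check for feedback loops, are explicit decisions rather than afterthoughts.

```python
import pandas as pd

# Hypothetical business data: outcomes are only recorded for
# applications the current process actually investigated.
cases = pd.DataFrame({
    "application_id": [101, 102, 103, 104],
    "investigated": [True, True, True, False],
    "case_outcome": ["cleared", "confirmed_fraud", "cleared", None],
})

# Target variable: 1 where the business data records confirmed fraud.
cases["target"] = (cases["case_outcome"] == "confirmed_fraud").astype(int)

# Feedback-loop risk: uninvestigated cases have no outcome, so treating
# them as non-fraud (target = 0) would bake the current process's blind
# spots into the model. Keep labels only where an investigation ran.
labelled = cases[cases["investigated"]]
print(labelled[["application_id", "target"]])
```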

A MoSCoW rating (Must/Should/Could/Won’t) alongside each User Story is useful in prioritising features when you’re working on a time-bound discovery, for instance. I’ve found that Product Owners rarely offer User Stories that aren’t ‘Must’ on a first pass, but under frequent review, and when they help prioritise development into sprints, a hierarchy soon emerges that can be reflected as ‘Should’ or ‘Could’ in the MoSCoW. Understandably, Product Owners most keenly describe the MoSCoW positives – ‘Must’, ‘Should’ and ‘Could’ – so there is an important role for the Data Scientist in eliciting and recording the upper bound of these stories, as well as in defining the relevant ‘Won’t’ items on which the assumptions and timelines are planned. Surfacing these limits so that they can be shared and understood minimises any later misunderstanding and, being Agile, can always be reviewed in due course anyway.
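A simple sketch of how such a rated backlog might be ordered for sprint planning follows; the stories and ratings are invented, and the design point is only that ‘Won’t’ items are recorded and surfaced rather than quietly dropped.

```python
# Illustrative ordering of a MoSCoW-rated backlog for a time-bound discovery.
MOSCOW_ORDER = {"Must": 0, "Should": 1, "Could": 2, "Won't": 3}

# Hypothetical backlog items.
backlog = [
    {"story": "Score every open application nightly", "moscow": "Must"},
    {"story": "Interactive dashboard for seniors' MI", "moscow": "Could"},
    {"story": "Map output for the triage team", "moscow": "Should"},
    {"story": "Real-time scoring at point of entry", "moscow": "Won't"},
]

# 'Won't' items stay on the list: recording them surfaces the agreed
# limits on which assumptions and timelines are planned.
for item in sorted(backlog, key=lambda s: MOSCOW_ORDER[s["moscow"]]):
    print(f"[{item['moscow']:>6}] {item['story']}")
```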


Achieving the right balance

It is my experience that this Agile approach strikes a good balance between pace and record-keeping, whilst maintaining tangible targets to aim for. There’s a strong case that the relevance of the solution, and therefore the impact it makes on the business, transcends the code, machine learning or graph techniques that underlie it. And if that is the case, then the significance of understanding and capturing requirements as an ongoing process should not be underestimated.


Assurance Scoring: an Agile example

This look at Agile analytics is one part of a series of blogs looking at the approaches and techniques used by the Capgemini data team whilst implementing our Assurance Scoring Framework. These techniques are drawing considerable attention as they provide the ability to prioritise and segregate a population into, for example, likely non-fraudulent and fraudulent groups. Matt’s blog focuses on integrating multiple analytics techniques, Natalia shares her experience of using machine learning algorithms, and Kannan discusses public sector use cases. Watch out for the next instalment in this series, when Tom Sinadinos delves further into one of the analysis techniques in the framework: network analysis.


Find out more

If you’d like to be part of the Capgemini team and the interesting and varied work in this area, we have a number of outstanding opportunities available. You are welcome to browse the job specifications and application links for the Data Scientist, Big Data Engineer, Big Data Analytics Architect and Data Visualisation Analyst roles.


About the author

Toby Gamm
Toby is a Data Science and Analytics consultant at Capgemini. He leads the delivery of solutions that provide insight, risk reduction, efficiency uplifts or new capability, from the initial capture of requirements through to the delivery of a service to be consumed by end users.
