Green data – The sustainable foundation of enterprise

Arne Rossmann
14th March 2024

Imagine a future where enterprises don’t just aim to become data powerhouses but do so sustainably, ensuring both technological advancement and planet preservation.

Tapping into the vast data sources for improved products and digital services is crucial. Yet, in our race for innovation, sustainability emerges as a pivotal cornerstone, safeguarding both our planet and a company’s future relevance. The secret sauce? A sustainable data value chain. Dive in as we explore the essence of green data, drawing insights from the Green Software Foundation.

In recent years, enterprises have pursued the goal of becoming data-powered in order to leverage the full potential of data across their value chain. Leveraging the insights from all relevant data sources to create new and improved products and additional (digital) services is the top priority for enterprises. But sustainability has become a main goal for businesses too, to preserve the planet and the company’s relevance in the coming decades.

The Greenhouse Gas Protocol defines three scopes (scope 1, scope 2 and scope 3) to delineate direct and indirect emission sources, improve transparency, and provide utility for different types of organizations and different types of climate policies and business goals. Through this framework companies can define and manage their emissions.

But this is only the first step. With clarification of emissions within the three scopes, companies get transparency on what’s happening within the value chain and clarity on where to reduce. But the main challenge is to know how to reduce their emissions.

Here, the Green Software Foundation has defined six principles to be applied in software development:

  1. Carbon efficiency: Emit the least amount of carbon possible.
  2. Energy efficiency: Use the least amount of energy possible.
  3. Carbon awareness: Do more when the electricity is cleaner and do less when the electricity is dirtier.
  4. Hardware efficiency: Use the least amount of embodied carbon possible.
  5. Measurement: What you can’t measure, you can’t improve.
  6. Climate commitments: Understand the exact mechanism of carbon reduction.
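To make the third principle, carbon awareness, more concrete, here is a minimal Python sketch that picks the start hour for a deferrable batch job so that it runs in the cleanest window of a grid forecast. The `ForecastPoint` type, the `greenest_start_hour` helper, and the intensity values are illustrative assumptions, not part of any Green Software Foundation tool. In practice the forecast would come from a carbon intensity data source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ForecastPoint:
    hour: int          # hour of day, 0-23
    intensity: float   # grid carbon intensity in gCO2e/kWh (made-up values below)


def greenest_start_hour(forecast: list[ForecastPoint], duration_hours: int) -> int:
    """Return the start hour of the contiguous window with the lowest average intensity."""
    if duration_hours < 1 or duration_hours > len(forecast):
        raise ValueError("duration must fit inside the forecast")
    best_start, best_avg = forecast[0].hour, float("inf")
    for i in range(len(forecast) - duration_hours + 1):
        window = forecast[i:i + duration_hours]
        avg = sum(p.intensity for p in window) / duration_hours
        if avg < best_avg:
            best_start, best_avg = window[0].hour, avg
    return best_start


# Hypothetical hourly forecast: intensity dips around hours 4 to 6.
forecast = [
    ForecastPoint(h, g)
    for h, g in enumerate([420, 410, 390, 300, 180, 150, 170, 260, 380, 430])
]
print(greenest_start_hour(forecast, duration_hours=3))  # prints 4
```

A scheduler could use the returned hour to delay a three-hour job until the electricity is cleaner, which is the "do more when the electricity is cleaner" idea in its simplest form.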

Examples of how to achieve each of the six principles are available, especially in the area of carbon awareness. Not only have the big hyperscalers (AWS, Azure, Google) made this topic a top priority, but dedicated smaller solutions can also be found. Two innovative examples are Green Mountain, which provides co-location data centers in Norway powered by 100 percent renewable energy, and windCORES, which helps companies deploy small co-location data centers inside wind turbines, powered by 100 percent renewable energy and making the most of the space in the turbines. With Green Data Engineering, a first view of how to apply these principles to data engineering has been laid out.

But one question remains: how can companies aiming to be data-powered enterprises do this in a sustainable way? The answer sounds simple and complex at the same time: apply the principles of the GHG framework to the data value chain and make the carbon footprint of data products and use cases transparent.

This is not as complicated as it sounds; most information is already available.

With the Carbon Aware SDK from the Green Software Foundation and the sustainability APIs, SDKs, and dashboards from hyperscalers, it is possible to calculate the carbon footprint of applications and processes. As an example, the Azure Sustainability Manager provides a comprehensive overview with multiple reports on the customer landscape running on Azure. But this is limited to one cloud. What about the more common case of customers pursuing a multi-cloud strategy?

Modern applications are composed of many smaller pieces of software (components) running on many different environments, for example: private cloud, public cloud, bare-metal, virtualized, containerized, mobile, laptops, and desktops.

Every environment requires a different model of measurement, and there is no single solution to calculate the environmental impacts for all components on all environments.

To address this, the Green Software Foundation has incubated the Impact Framework (IF), a framework to model, measure, simulate, and monitor the environmental impacts of software. It allows you to define a calculation manifest, a YAML file that describes how emissions are calculated. So rather than just saying “Carbon is X” you can say “Carbon is X and here is all the data, all the working out, and all the assumptions and models that we used.” You can run the manifest to confirm a claim, and if you don’t agree with some of the data, models, or assumptions, you can change them and run it again to see how that alters the value.

IF represents the carbon footprint of different components in a graph, which lets it aggregate the information and capture dependencies and interconnections. Each component in the graph is described by three elements:

  • Configuration describes shared information regarding this component and, most importantly, parameters required by this model.
  • Observations are a time series of data points used as inputs to the model.
  • Model is a plugin which, when given some configuration and a series of observations, can calculate the impact, e.g. the carbon impact from an observation of CPU utilization.
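The interplay of configuration, observations, and model can be sketched in Python. This is not the Impact Framework's actual plugin API. The `Config` and `Observation` types, the `linear_cpu_model` function, the linear power curve, and all numeric values are assumptions chosen for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Config:
    # Shared parameters for one component (illustrative assumptions).
    idle_watts: float
    max_watts: float
    grid_intensity: float  # gCO2e per kWh


@dataclass(frozen=True)
class Observation:
    duration_s: float
    cpu_utilization: float  # 0.0 to 1.0


def linear_cpu_model(config: Config, observations: list[Observation]) -> float:
    """Estimate operational carbon in gCO2e, assuming power scales linearly with CPU load."""
    total_kwh = 0.0
    for obs in observations:
        watts = config.idle_watts + (config.max_watts - config.idle_watts) * obs.cpu_utilization
        total_kwh += watts * obs.duration_s / 3_600_000  # watt-seconds to kWh
    return total_kwh * config.grid_intensity


config = Config(idle_watts=100, max_watts=300, grid_intensity=400)
observations = [Observation(3600, 0.5), Observation(3600, 1.0)]
print(round(linear_cpu_model(config, observations), 2))  # 200.0 gCO2e

# Disagree with an assumption? Change it and run the same model again.
cleaner_grid = replace(config, grid_intensity=50)
print(round(linear_cpu_model(cleaner_grid, observations), 2))  # 25.0 gCO2e
```

Keeping the assumptions in `Config` separate from the measured `Observation` series is what makes a result reproducible: anyone can swap one parameter and see how the value changes.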

With this approach, it’s easy to aggregate the carbon footprint of an application’s software components. And with proper portfolio management in place, mapping application-based carbon footprints along the value chain is largely a matter of calculation.
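The aggregation step can be illustrated with a small tree in Python. This is only a sketch: the `Node` class, the component names, and the gCO2e figures are hypothetical and do not reflect how the Impact Framework stores its graph.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    own_gco2e: float = 0.0  # footprint measured directly on this node
    children: list["Node"] = field(default_factory=list)

    def total(self) -> float:
        """Footprint of this node plus everything beneath it."""
        return self.own_gco2e + sum(child.total() for child in self.children)


# Hypothetical application made of three components, one of which has sub-components.
app = Node("order-service", children=[
    Node("api", own_gco2e=120.0),
    Node("database", own_gco2e=80.0),
    Node("batch", children=[
        Node("etl", own_gco2e=40.0),
        Node("reporting", own_gco2e=10.0),
    ]),
])

print(app.total())  # 250.0
```

The same recursion works one level up: a portfolio node whose children are applications yields the footprint of a whole value-chain step.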

But what about data products and use cases? Didn’t we want to be data-powered? Sure. As a brief recap, the data mesh article on Martin Fowler’s site describes the data product, the architectural quantum, as “the node on the mesh that encapsulates three structural components required for its function, providing access to the domain’s analytical data as a product.” The three components are:

  • Code
  • Data and metadata
  • Infrastructure

Sound familiar? Right, a data product is easily comparable to any other application. Therefore, transferring the Impact Framework to a Data Mesh approach is not as hard as it sounds.
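As a sketch of that transfer, the Python below treats a data product as the sum of its three structural components and then splits its footprint across the use cases that consume it. The `DataProduct` class, the `allocate_by_usage` helper, and the choice to allocate proportionally by query count are assumptions made for illustration. Other allocation keys, such as data volume or compute time, may fit better in practice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataProduct:
    name: str
    code_gco2e: float            # pipelines, APIs, transformation jobs
    data_gco2e: float            # storage of data and metadata
    infrastructure_gco2e: float  # platform the product runs on

    def total(self) -> float:
        return self.code_gco2e + self.data_gco2e + self.infrastructure_gco2e


def allocate_by_usage(product: DataProduct, queries: dict[str, int]) -> dict[str, float]:
    """Split a product's footprint across use cases in proportion to query count (assumed key)."""
    total_queries = sum(queries.values())
    if total_queries == 0:
        raise ValueError("no usage recorded for this data product")
    return {use_case: product.total() * n / total_queries for use_case, n in queries.items()}


# Hypothetical data product and consumers; all figures are made up.
sales = DataProduct("sales-orders", code_gco2e=30.0, data_gco2e=50.0, infrastructure_gco2e=120.0)
shares = allocate_by_usage(sales, {"demand-forecasting": 300, "monthly-reporting": 100})
print(shares)  # {'demand-forecasting': 150.0, 'monthly-reporting': 50.0}
```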

With that, companies have the right tooling in place to support ESG compliance on their journey to becoming data-powered enterprises. As the transformation of the whole value chain towards more digital services and products continues, mapping the carbon footprint along the data value chain becomes essential. This matters not only for compliant ESG reporting, including the scope 3 disclosures required under the European Union’s Corporate Sustainability Reporting Directive, which began to apply in January 2024, but also for staying in the Race to Zero. The race is still on, and it’s a data-powered race.

INNOVATION TAKEAWAYS

COMPLIANCE = TRANSFORMATION

Enterprises need to comply with the EU’s Corporate Sustainability Reporting Directive, which has an impact on the transformation towards more digital services and products.

THE SOLUTIONS ARE IN THE CLOUD

Hyperscalers provide solutions within their environments to measure and tackle the carbon footprint of workloads.

AN OPEN FRAMEWORK

With the Impact Framework (IF) from the Green Software Foundation, a framework for overarching carbon impact calculations exists.

Interesting read?

Capgemini’s Innovation publication, Data-powered Innovation Review | Wave 7, features 16 such articles, crafted by leading experts from Capgemini and partners like Aible, the Green Software Foundation, and Fivetran. Discover groundbreaking advancements in data-powered innovation, explore the broader applications of AI beyond language models, and learn how data and AI can contribute to creating a more sustainable planet and society.

Arne Rossmann

Chief Architect Data & AI for Intelligent Industry
As a Chief Architect in the Data & AI for Intelligent Industry team, I advise and guide our clients on architectures for Data & AI platforms within the domains of Digital Manufacturing, Digital Twin, Intelligent Supply Chain, Connected Products, and 5G & Edge, across all our sectors. The main goal of my work is to enable our clients on their journey towards becoming data-powered enterprises, so that they can leverage the value lying within their data by sharing it across the company and with the outside network of suppliers and partners.

Asim Hussain

Executive Director, Green Software Foundation
Asim is a developer, trainer, author, and speaker with over 24 years’ experience working for organizations such as the European Space Agency, Microsoft, and Intel. He is the Executive Director of the Green Software Foundation, which he co-founded in 2021. The foundation is an industry consortium of over 60 member organizations working to change how we build software so that it has zero harmful environmental effects.