With enterprises jumping on the bandwagon of data as corporate gold, it is more crucial than ever to understand where that data comes from: not only internal sources, but also external ones and perhaps even synthetic, generated datasets. Getting the right data requires the sharp market eye typical of procurement. It needs an R&D-like vision to design how the data will produce value. And it taps into the outward-looking mindset of marketing to envision how to market and monetize data, internally and externally. And if data can be put on the corporate balance sheet, it will even activate the CFO and CEO perspectives. All aboard!
- For any organization aspiring to become data-driven, creating an inventory of data assets across the company is a must-do task. Through machine learning and AI, large numbers of data stores can be scanned to identify, catalog and manage data assets.
- Mapping external marketplaces, including brokers of open data and industry consortia, is equally key to identifying data sources that, combined with the organization's own data, can lead to value-creating insights.
- Specialized suppliers of (tagged) training data can provide crucial input for machine learning purposes, especially when the organization does not hold such data itself.
- New ways of accessing training data without invading privacy – such as Google’s Federated Learning – open up opportunities for collecting data from previously unavailable sources.
- Simulated environments and reinforcement learning can provide crucial (synthetic) training data in cases where real-life data is insufficient or unavailable.
- An increasing number of pre-trained models can be purchased on the market, eliminating the need to collect high volumes of training data in the first place.
- Data becomes a corporate asset when it provides measurable economic benefits, increases shareholder value or contributes to the corporate purpose; therefore, there is a reasonable case for measuring its economic value on the corporate balance sheet.
- Airbus’ “Skywise” cloud data platform allows airlines and other aeronautics players to store, manage and analyze their data and that of their ecosystem more efficiently. It maximizes the availability of a fleet of aircraft, increasing the operational and economic performance of an airline.
- Chevron, Schlumberger and Microsoft launched an initiative to create a shared platform for “petrotechnical” data and AI, aligning with the Open Group’s Open Subsurface Data Universe (OSDU) Data Platform standards.
- Continental, one of the largest automotive parts suppliers, has developed an AI-based virtual simulation program that generates 5,000 miles of vehicle test data per hour. Collecting the same test data through physical testing would take over 20 days. (Capgemini Research Institute)
- 890 by Capgemini offers public, private and community marketplaces for curated datasets, providing companies ‘as-a-service’ access to data from external and internal sources, as well as the opportunity to monetize their own.
- The European Commission sees its member states aspiring to be much more effective at sharing data with other governments and organizations in a secure way, whilst respecting intellectual property and privacy. For this purpose, it started a new project, the “Support Center for Data Sharing”.
- Creating the foundation for becoming a data-driven organization, serving many different potential objectives and purposes of the enterprise.
- Improving effectiveness and value-creation of existing business intelligence and analytics by adding external data.
- Shortening time to market for new analytics and AI solutions by tapping into external sources of (industry) training data.
- Monetizing the organization's own data and aggregated data.
- Data exchange platforms: AWS Data Exchange, Oracle Data Marketplace, 890 by Capgemini Data Exchange, Snowflake Data Exchange, Data Republic data sharing governance platform, Google Dataset Search
- Data brokers:
- Data exploration: Informatica Enterprise Data Catalog, Cloudera Navigator, Apache Atlas, Waterline Data Catalog, Microsoft Data Catalog, Collibra Collaborative Data Platform
- Data creation: ai training data for autonomous cars, Appen data annotation, Scale AI labeled data, Lionbridge enterprise-level training data, Foxintelligence consumer intelligence, Mostly.ai synthetic data
- Data monetization: Gartner’s Information Asset Valuation Method Framework