Procuring open source software: Top 3 factors for analytics solutions

You use a Chrome or Firefox browser (probably on an Android phone) to visit websites served by Apache and hosted on Linux, in order to interact with friends and colleagues who also rely on free and open source software (FOSS)… yet in the workplace it’s far more common to use Commercial-Off-The-Shelf (COTS) software, which can be inflexible, expensive, outdated and tied to features unique to one product.
The assumption that all FOSS is riddled with viruses and backdoors, and used only by academics, charities, students and hobbyists, is outdated and wrong… in most cases… but don’t throw away your software licenses and promotional gifts just yet (4GB USB stick anyone?).
This blog shares the guiding principles I use to help clients combine open source and vendor software in their enterprise analytics solutions. When defining your mix of FOSS and COTS products (mix being the operative word, since over-reliance on one product or supplier creates a single point of failure and expensive inflexibility), due diligence gives longevity to decisions which optimise cost of ownership and protect your data.
  1. Technical Support
  • Does the software meet your user need ‘out of the box’?

    • Yes – great! But will this always be the case or will you want to customise in the future?
    • No – what is required to build on it to meet your need?
  • How easy is it to obtain technical support?

    • Supplier provides under license
    • Ecosystem of commercial organisations supporting product
    • Strong online user community [1]
    • Tricky to find and employ niche experts (any FORTRAN 77 programmers out there?)
  • How long has this software been operating?

    • In other words: is this an established, heavily used technology?
    • How often are updates released?
    • Is support growing or waning?
  • When will it become a legacy product?

    • Who else is invested in it?
    • Is it based on outdated or cutting edge technology?


  2. Legality / Policy
All software must be licensed (the Open Source Initiative has a useful guide).
You should consider the following:
  • Does your industry / organisation mandate certain software or standards (e.g. for security, auditing, collaboration)?
  • Will you use the software for commercial purposes?
  • Within your organisation or externally? 
  • Use it to build other software or services (e.g. JavaScript libraries for web)?

    • Will you be redistributing these?
    • Will you be charging a fee?


  3. Scalability
Virtual products can scale almost infinitely at negligible cost, meaning that your user base could increase exponentially overnight (compare Angry Birds whose throughput is only limited by the bandwidth of App Stores, with a cobbler whose throughput is limited by how quickly they can repair shoes). In a digital economy organisations must consider the implications for their stack of dramatic increases in user base as their App or service goes viral. Consider: 
  • Does the software impose technical limitations on user numbers?

    • Will you need to obtain more servers or bandwidth to mitigate?
  • What are the cost implications of increasing user numbers?

    • Will you move to a higher support tier?
    • Will your license costs increase linearly with user numbers?
  • Does the software enable outsourcing of:

    • Hosting (e.g. cloud)?
    • Maintenance (can you expect someone else to manage this software on your behalf)?
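
To make the cost questions above concrete, here is a minimal Python sketch comparing a license whose cost rises linearly with user numbers against a flat fee per support tier. The class names, prices and tier boundaries are all hypothetical and chosen only for illustration; substitute the figures from your own quotes.

```python
from dataclasses import dataclass


@dataclass
class PerUserLicense:
    """Hypothetical pricing model: cost grows linearly with user numbers."""
    price_per_user: float

    def annual_cost(self, users: int) -> float:
        return self.price_per_user * users


@dataclass
class TieredSupport:
    """Hypothetical pricing model: a flat annual fee per support tier."""
    tiers: list  # (max_users, annual_fee) pairs, sorted by max_users

    def annual_cost(self, users: int) -> float:
        # Return the fee of the first tier large enough for this user base.
        for max_users, fee in self.tiers:
            if users <= max_users:
                return fee
        raise ValueError(f"No support tier covers {users} users")


# Illustrative figures only, not real vendor prices.
per_user = PerUserLicense(price_per_user=100.0)
tiered = TieredSupport(tiers=[(1_000, 20_000.0), (10_000, 60_000.0), (100_000, 150_000.0)])

for users in (100, 1_000, 10_000, 100_000):
    print(users, per_user.annual_cost(users), tiered.annual_cost(users))
```

With these made-up numbers the per-user license is cheaper for 100 users, but the tiered model is cheaper from 1,000 users upwards. Running the same comparison with real quotes shows where the crossover falls for your expected growth.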


Final Point
Start high level and avoid being drawn into a debate with arguments like:
“I once used…and it was very good”
“I had a lovely meal with the pre-sales team from…”
This blog should help you quickly narrow down your options. After that, the choice comes down to more nuanced points such as usability, skills in your organisation, existing supplier relationships, payback period (when do you need the software to be live, and how long is it likely to stay live?) and compatibility with legacy hardware, operating systems and software.
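
One simple way to do that first narrowing is a weighted score across the three factors. The Python sketch below is an illustration only: the weights, the 0 to 5 scores and the product names are hypothetical, and it is not a description of any particular scoring framework.

```python
# Hypothetical weights for the three factors in this post (they sum to 1.0).
WEIGHTS = {"technical_support": 0.4, "legality_policy": 0.3, "scalability": 0.3}


def weighted_score(scores: dict) -> float:
    """Combine 0-5 scores for each factor into a single weighted score."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"Missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[factor] * scores[factor] for factor in WEIGHTS)


def shortlist(candidates: dict, top_n: int = 2) -> list:
    """Return the names of the top_n candidates, highest weighted score first."""
    ranked = sorted(
        candidates,
        key=lambda name: weighted_score(candidates[name]),
        reverse=True,
    )
    return ranked[:top_n]


# Made-up candidates and scores, purely for illustration.
candidates = {
    "Product A": {"technical_support": 4, "legality_policy": 5, "scalability": 2},
    "Product B": {"technical_support": 3, "legality_policy": 3, "scalability": 5},
    "Product C": {"technical_support": 2, "legality_policy": 2, "scalability": 3},
}

print(shortlist(candidates))
```

Agreeing the weights before anyone scores a product keeps the discussion at a high level and makes it harder for a single anecdote to decide the outcome.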
I use a reference architecture that helps clients accelerate their decisions about the right technology architecture to use. Capgemini’s Assurance Scoring approach is a completely modular and customisable framework for open source and proprietary analytics tools on a cloud platform or on your own hardware.
Are you excited by the challenge of integrating open source and vendor provided software for clients? If you’re looking for opportunities to build analytic solutions, then why not apply to join our Data Science team! We work on interesting projects at the heart of clients’ insights-led digital operations. To get a feel for who we are, and the type of work we do, you can read our blogs and check out the Data Scientist and Big Data Engineering roles we’re recruiting for.
I work in the Capgemini UK Data Science team and next week my colleague Bhima Auro will be blogging about Data Science skills.
This blog is also published on LinkedIn Pulse.
[1] Don’t rule out Stack Exchange as a support mechanism – just be prepared for ridicule if you ask noob questions!
