About Capgemini

With more than 170,000 people in over 40 countries, Capgemini is one of the world's foremost providers of consulting, technology and outsourcing services. The Group reported 2014 global revenues of EUR 10.5 billion. Together with its clients, Capgemini creates and delivers business and technology solutions that fit their needs and drive the results they want. A deeply multicultural organization, Capgemini has developed its own way of working, the Collaborative Business Experience™, and draws on Rightshore®, its worldwide delivery model.

Learn more about us at http://www.capgemini.com/.


Rightshore® is a trademark belonging to Capgemini.

Capgemini is an Equal Opportunity Employer encouraging diversity in the workplace. All qualified applicants will receive consideration for employment without regard to race, national origin, gender identity/expression, age, religion, disability, sexual orientation, genetics, veteran status, marital status or any other characteristic protected by law.


Title: Hadoop Architect / Developer


Job Responsibilities:

Design and build distributed, scalable, and reliable data pipelines that ingest and process data at scale and in real time.

Collaborate with other teams to design and develop data tools that support both operations and product use cases.

Source huge volumes of data from diverse data platforms into the Hadoop platform.

Perform offline analysis of large data sets using components from the Hadoop ecosystem.

Evaluate big data technologies and prototype solutions to improve our data processing architecture.

Knowledge of the Private Banking & Wealth Management domain is an added advantage.

Candidate Profile:

10+ years of hands-on programming experience, with 3+ years on the Hadoop platform

Experience designing and architecting Hadoop-based platforms for building data lakes

Knowledge of the various components of the Hadoop ecosystem and experience applying them to practical problems

Proficiency with Java and at least one scripting language such as Python or Scala

A flair for data, schemas, and data modeling, and for bringing efficiency to the big data life cycle

Experience building ETL frameworks in Hadoop using Pig/Hive/MapReduce

Experience creating custom UDFs and custom input/output formats/SerDes

Ability to acquire, compute, store, and provision various types of datasets in the Hadoop platform

Understanding of various visualization platforms (Tableau, QlikView, and others)

Experience with data warehousing, ETL tools, and MPP database systems

Strong object-oriented design and analysis skills

Excellent technical and organizational skills

Excellent written and verbal communication skills


Top skill sets / technologies:

Java / Python / Scala


Sqoop / Flume / Kafka / Pig / Hive / (Talend, Pentaho, Informatica, or similar ETL) / HBase / NoSQL / MapReduce / Spark

Data Integration / Data Management / Data Visualization experience



Apply now