With more than 170,000 people in over 40 countries, Capgemini is one of the world’s foremost providers of consulting, technology and outsourcing services. The Group reported 2014 global revenues of EUR 10.5 billion. Together with its clients, Capgemini creates and delivers business and technology solutions that fit their needs and drive the results they want. A deeply multicultural organization, Capgemini has developed its own way of working, the Collaborative Business Experience™, and draws on Rightshore®, its worldwide delivery model.
Learn more about us at http://www.capgemini.com/.
Rightshore® is a trademark belonging to Capgemini.
Capgemini is an Equal Opportunity Employer encouraging diversity in the workplace. All qualified applicants will receive consideration for employment without regard to race, national origin, gender identity/expression, age, religion, disability, sexual orientation, genetics, veteran status, marital status or any other characteristic protected by law.
Title: Hadoop Architect / Developer
Design and build distributed, scalable, and reliable data pipelines that ingest and process data at scale and in real time.
Collaborate with other teams to design and develop data tools that support both operations and product use cases.
Source high volumes of data from diverse data platforms into the Hadoop platform.
Perform offline analysis of large data sets using components from the Hadoop ecosystem.
Evaluate big data technologies and prototype solutions to improve our data processing architecture.
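The offline MapReduce analysis described above can be sketched with a Hadoop Streaming-style word count. This is an illustrative toy, not Capgemini's stack: Hadoop Streaming runs plain executables that read stdin and write tab-separated key/value pairs, so the mapper and reducer below mirror that contract, with the framework's shuffle/sort simulated in-process by a plain `sorted()` call.

```python
import itertools

# Hadoop Streaming contract: the mapper emits (key, value) pairs, the
# framework sorts them by key, and the reducer sees each key's values
# grouped together. Here all three phases run in one process for clarity.

def mapper(lines):
    """Map phase: emit (word, 1) for every word in the input lines."""
    for line in lines:
        for word in line.strip().split():
            yield word.lower(), 1

def reducer(pairs):
    """Reduce phase: sum counts per word; pairs must arrive sorted by key."""
    for word, group in itertools.groupby(pairs, key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)

def word_count(lines):
    """Simulate the map -> shuffle/sort -> reduce pipeline in-process."""
    mapped = sorted(mapper(lines))  # stands in for the framework's shuffle/sort
    return dict(reducer(mapped))
```

In a real deployment the mapper and reducer would be two separate scripts submitted via the `hadoop-streaming` jar, reading from and writing to HDFS rather than Python lists.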
Knowledge of the Private Banking & Wealth Management domain is an added advantage
10+ years of hands-on programming experience with 3+ years in Hadoop platform
Experience designing and architecting Hadoop-based platforms for building data lakes
Knowledge of the various components of the Hadoop ecosystem and experience applying them to practical problems
Proficiency in Java and in at least one additional language such as Python or Scala
A flair for data, schemas, and data modeling, and for bringing efficiency to the big data life cycle
Experience building ETL frameworks in Hadoop using Pig, Hive, or MapReduce
Experience creating custom UDFs and custom input/output formats and SerDes
Ability to acquire, compute, store, and provision various types of datasets in the Hadoop platform
Understanding of visualization platforms (Tableau, QlikView, and others)
Experience with data warehousing, ETL tools, and MPP database systems
Strong object-oriented design and analysis skills
Excellent technical and organizational skills
Excellent written and verbal communication skills
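The ETL-framework experience the list above asks for boils down to three phases: extract raw records, transform them into clean typed rows, and load them into a target store. A minimal sketch of that shape, using an in-memory SQLite table as a stand-in for a Hive/HDFS target (the `account`/`amount` column names are invented for illustration):

```python
import csv
import io
import sqlite3

# Toy extract-transform-load pipeline. In a Hadoop ETL framework the same
# three phases would be Pig or Hive jobs reading from and writing to HDFS;
# plain Python and SQLite stand in here so the structure is easy to see.

def extract(raw_csv):
    """Extract: parse raw CSV text into dict rows."""
    return list(csv.DictReader(io.StringIO(raw_csv)))

def transform(rows):
    """Transform: normalize types and drop rows with a missing amount."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # data-quality rule: skip incomplete records
        cleaned.append((row["account"], float(row["amount"])))
    return cleaned

def load(records, conn):
    """Load: write the cleaned records to the target table; return row count."""
    conn.execute("CREATE TABLE IF NOT EXISTS txns (account TEXT, amount REAL)")
    conn.executemany("INSERT INTO txns VALUES (?, ?)", records)
    return conn.execute("SELECT COUNT(*) FROM txns").fetchone()[0]
```

A production framework would add scheduling, partitioning, and lineage tracking around these phases, but the extract/transform/load separation stays the same.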
Top skill sets / technologies:
Java / Python / Scala
Unix / ETL / data warehouse / SQL knowledge
Sqoop / Flume / Kafka / Pig / Hive / (Talend, Pentaho, Informatica, or a similar ETL tool) / HBase / NoSQL / MapReduce / Spark
Data integration / data management / data visualization experience