Role: Hadoop with Java & Spark Development
Exp: 4 to 6 Years
• Candidate should have a good understanding of Big Data/Hadoop technologies (Cloudera) and should be proficient in development with Unix shell scripting, PL/SQL, and Scala.
• Expertise in Java/J2EE and Big Data technologies such as Hadoop, Apache Spark, and Hive is required. Candidates must have applied these skills continuously over the last 2-3 years.
• Good knowledge of Python, with experience applying machine learning to data science problems in at least 2-3 projects.
• Industry experience: Insurance Auto industry is preferred but not a must.
• Education: A technical degree such as Engineering/Computer Science/IT is preferred.
• The above is only the minimum skill set; any additional relevant knowledge and experience in Big Data technologies and visualization tools such as Tableau will be a definite plus for this position.
• Selecting & integrating any Big Data tools & frameworks required to provide the requested capabilities
• Designing ETL processes
• Monitoring & evaluating performance and advising on any necessary infrastructure changes, including changing the cloud platform
• Defining data retention policies
• Proficient understanding of distributed computing principles
• Management of a Hadoop cluster, with all included services
• Ability to solve any ongoing issues with operating the cluster
• Proficiency with Hadoop v2, MapReduce, HDFS
• Experience building stream-processing systems, using solutions such as Storm or Spark Streaming
• Good knowledge of Big Data querying tools, such as Pig, Hive, and Impala
• Experience with Spark and NoSQL databases, such as HBase, Cassandra, MongoDB
• Experience with integration of data from multiple data sources
• Knowledge of various ETL techniques and frameworks, such as Flume
• Experience with Cloudera/MapR/Hortonworks