PySpark | 4 to 6 years | Bengaluru

Job Description
  • Must have hands-on experience implementing an AWS big data lake using EMR and Spark
  • 3 years of working experience with Spark, Hive, and message-queue or pub/sub streaming technologies
  • 6 years of experience developing data pipelines using a mix of languages (Python, Scala, SQL, etc.) and open-source frameworks to implement data ingestion, processing, and analytics
  • Experience leveraging open-source big data processing frameworks such as Apache Spark and Hadoop, and streaming technologies such as Kafka
  • Hands-on experience with newer technologies relevant to the data space, such as Spark, Airflow, Apache Druid, Snowflake, or other OLAP databases
  • Experience developing and deploying data pipelines and real-time data streams within a cloud-native infrastructure, preferably AWS
Primary Skills
  • PySpark
  • AWS
Secondary Skills
  • Experience using CI/CD pipelines (GitLab)
  • Experience implementing code quality checks
  • Used PEP 8, Pylint, or other code quality tools
  • Experience with Python plugins and operators such as FTPSensor, OracleOperator, etc.



Posted on: August 21, 2020
