Python Data Scientist

  • Working experience implementing various AI-powered models with NLP and deep learning algorithms
  • Experience in analyzing unstructured text data as well as language, speech, image, and video data across multiple industries (e.g., manufacturing, retail)
  • Develop connections from Spark Streaming to Kafka/Flume using Python
  • Examine streaming performance and provide optimal and precise development using Python/PySpark, including connecting to data (structured or unstructured), extraction, and cleaning
  • Develop models (classification or clustering) using MLlib or Anaconda
  • Gather, evaluate, and document business requirements related to analytics, translate them into an analytics solution definition, and implement the solution using Python or Scala
  • Extract data from raw files using Python (Anaconda or built-in) for POC
  • Pull or create data from different sources such as HBase, Hive, Impala, or MongoDB
  • Analyze data from multiple data sources (DBs, flat files, etc.) and build predictive models using Python
  • Linux shell scripting with Python and cron jobs to schedule runs (batch or real-time)
  • Scala knowledge is preferable in some cases
  • Develop different models in Python, Pandas, and MLlib/PySpark, compare their real-time and batch performance, and choose the better solution for each case
  • Validate the models statistically as well as from a business perspective, in discussion with business stakeholders
  • Ability to support and guide model deployment and model lifecycle management
  • Create model documentation as per client/regulatory standards
  • Degree in a quantitative field (Math, Statistics, Economics, Computer Science, and/or Engineering), MBA
  • Experienced and skilled in Python (incl. PySpark, Spark, MLlib)
  • Hands-on experience in analytical techniques including sampling, clustering, decision trees, forecasting, SVM, Random Forest, and linear/logistic regression
  • Hands-on experience in Python, PySpark, MLlib, Spark, Mesos
  • Hands-on experience using Hive, HBase, Impala
  • Knowledge of Kafka and Flume is a plus
  • Data exploration using OpenCV, NumPy, Matplotlib, SciPy, and Pandas for image analysis
  • Good knowledge of Python-based libraries (e.g., Keras, TensorFlow); knowledge of Scala will be beneficial
  • Experience working with AWS and Cloudera/Hortonworks; knowledge of where analytics fits into an end-to-end business solution
  • Ability to work with business and technology teams to build and deploy an analytical solution as per client needs
  • Ability to multitask, solve problems, and think strategically
  • Strong communication and collaboration skills
  • Working experience in other data science technologies (e.g., R, SAS, SPSS, Matlab) is also preferred



Posted on:

January 3, 2019

Experience level:

Experienced (non-manager)

Education level:

Bachelor's degree or equivalent

Contract type:





Business Information Management

