We are looking for a Spark/Scala/PySpark developer who knows how to fully exploit the potential of our Spark cluster. You should be able to clean, transform, and analyze vast amounts of raw data from various systems using Spark to provide ready-to-use data.

Responsibilities:
- Create Scala/Spark/PySpark jobs for data transformation and aggregation (a sketch of this kind of job follows the skills list below)
- Produce unit tests for Spark transformations and helper methods
- Write Scaladoc-style documentation for all code
- Design data processing pipelines

Skills:
- PySpark
- Scala (with a focus on the functional programming paradigm)
- Apache Spark 2.x/3.x:
  - Apache Spark RDD API
  - Apache Spark SQL DataFrame API
  - Apache Spark Streaming API
- Spark query tuning and performance optimization
- SQL database integration (Postgres and/or MySQL)
- Experience working with HDFS and AWS (S3, Redshift, EMR, IAM, policies, routing)
- CI/CD pipelines (Jenkins, GitLab/Bitbucket)
- Deep understanding of distributed systems (e.g. partitioning, replication, consistency, and consensus)
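To give a flavor of the day-to-day work, here is a minimal sketch of the kind of Scala/Spark transformation job the role involves, written with Scaladoc-style documentation. The object name DailyRevenueJob, the column names, and the S3 paths are illustrative assumptions for this posting, not references to an actual codebase.

    import org.apache.spark.sql.{DataFrame, SparkSession}
    import org.apache.spark.sql.functions.{col, sum, to_date}

    object DailyRevenueJob {

      /** Aggregates raw order events into daily revenue per customer.
        *
        * @param raw input DataFrame with columns `customer_id`, `amount`, `event_ts`
        * @return one row per (customer_id, event_date) with the summed `daily_revenue`
        */
      def dailyRevenue(raw: DataFrame): DataFrame =
        raw
          .withColumn("event_date", to_date(col("event_ts"))) // truncate timestamp to a calendar date
          .groupBy("customer_id", "event_date")
          .agg(sum("amount").as("daily_revenue"))

      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("daily-revenue").getOrCreate()
        // Input and output paths are illustrative placeholders.
        val raw = spark.read.parquet("s3a://example-bucket/raw/orders/")
        dailyRevenue(raw)
          .write
          .mode("overwrite")
          .parquet("s3a://example-bucket/curated/daily_revenue/")
        spark.stop()
      }
    }

Keeping transformations as pure DataFrame => DataFrame functions like dailyRevenue is what makes the unit-testing responsibility practical: such functions can be exercised directly against a local SparkSession without deploying to the cluster.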
Employment Category:
Employment Type: Full time
Industry: IT - Software
Role Category: Embedded / System Software
Functional Area: Not Applicable
Role/Responsibilities: Data Streaming (Java+Spark+PySpark+Scala) +
Contact Details:
Company: Gharondaa Advisors
Location(s): Multi-City, India