Experience: 4 to 6 years
Job Location: Noida / Mumbai / Pune / Bangalore / Gurgaon / Kochi (Hybrid work)
Notice Period: Immediate to 30 days
Skill Set: ADF, PySpark, SQL
Role & responsibilities
Key Responsibilities:
Develop scalable data pipelines using Azure Data Factory (ADF), Databricks, PySpark, and Delta Lake to support ML and AI workloads.
Optimize and transform large datasets for feature engineering, model training, and real-time AI inference.
Build and maintain lakehouse architecture using Azure Data Lake Storage (ADLS) & Delta Lake.
Work closely with ML engineers and data scientists to deliver high-quality, structured data for training Generative AI models.
Implement MLOps best practices for continuous data processing, versioning, and model retraining workflows.
Monitor & improve data quality using Azure Data Quality Services.
Ensure cost-efficient data processing in Databricks using Photon, Delta Caching, and Auto-Scaling Clusters.
Secure data pipelines by implementing RBAC, encryption, and governance.
Required Skills & Experience:
3+ years of experience in Data Engineering with Azure & Databricks.
Proficiency in PySpark, SQL, and Delta Lake for large-scale data transformations.
Strong experience with Azure Data Factory (ADF), Azure Synapse, and Event Hubs.
Hands-on experience in building feature stores for ML models.
Experience with ML model deployment and MLOps pipelines (MLflow, Kubernetes, or Azure ML) is a plus.
Good understanding of Generative AI concepts and handling unstructured data (text, images, video, embeddings).
Familiarity with Azure DevOps, CI/CD for data pipelines, and Infrastructure as Code (Terraform, Bicep).
Strong problem-solving, debugging, and performance optimization skills.
Preferred candidate profile
Interested candidates, kindly share your updated resume at si*********i@in*****n.com
Key skills: Azure Data Factory, Databricks, PySpark, ADF, Data Factory, Azure Databricks, Azure Data Lake, SQL
Infogain is a Silicon Valley-headquartered company with software platform engineering and deep domain expertise in the travel, retail, insurance, and high-technology industries. We accelerate the delivery of digital customer engagement systems using digital technologies such as cloud, mic...