
Data Engineer - Python, PySpark, SQL, Spark @ Siemens

Job Description

As a Data Engineer, you are required to:

  • Design, build, and maintain data pipelines that efficiently process and transport data from various sources to storage systems or processing environments, while ensuring data integrity, consistency, and accuracy across the entire pipeline.
  • Integrate data from different systems, often involving data cleaning, transformation (ETL), and validation.
  • Design the structure of databases and data storage systems, including schemas, tables, and relationships between datasets, to enable efficient querying.
  • Work closely with data scientists, analysts, and other stakeholders to understand their data needs and ensure that the data is structured in a way that makes it accessible and usable.
  • Stay up to date with the latest trends and technologies in the data engineering space, such as new data storage solutions, processing frameworks, and cloud technologies; evaluate and implement new tools to improve data engineering processes.

Qualification: Bachelor's or Master's degree in Computer Science & Engineering, or equivalent. A professional degree in Data Science or Engineering is desirable.

Experience level: At least 3-5 years of hands-on experience in Data Engineering and ETL.

Desired Knowledge & Experience:

  • Spark: Spark 3.x, RDD/DataFrames/SQL, Batch/Structured Streaming; knowledge of Spark internals (Catalyst/Tungsten/Photon)
  • Databricks: Workflows, SQL Warehouses/Endpoints, DLT, Pipelines, Unity, Autoloader
  • IDE & tooling: IntelliJ/PyCharm, Git, Azure DevOps, GitHub Copilot
  • Testing: pytest, Great Expectations
  • CI/CD: YAML Azure Pipelines, Continuous Delivery, Acceptance Testing
  • Big Data design: Lakehouse/Medallion Architecture, Parquet/Delta, Partitioning, Distribution, Data Skew, Compaction
  • Languages: Python / Functional Programming (FP)
  • SQL: T-SQL / Spark SQL / HiveQL
  • Storage: Data Lake and Big Data storage design

Additionally, it is helpful to know the basics of:

  • Data pipelines: ADF / Synapse Pipelines / Oozie / Airflow
  • Languages: Scala, Java
  • NoSQL: Cosmos, Mongo, Cassandra
  • Cubes: SSAS (ROLAP, HOLAP, MOLAP), AAS, Tabular Model
  • SQL Server: T-SQL, Stored Procedures
  • Hadoop: HDInsight / MapReduce / HDFS / YARN / Oozie / Hive / HBase / Ambari / Ranger / Atlas / Kafka
  • Data Catalog: Azure Purview, Apache Atlas, Informatica

Required Soft Skills & Other Capabilities:

  • Great attention to detail and good analytical abilities
  • Good planning and organizational skills
  • Collaborative approach to sharing ideas and finding solutions
  • Ability to work independently and also in a global team environment
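
To make the pipeline and Lakehouse items above concrete, here is a minimal, illustrative PySpark sketch of a bronze-to-silver ETL step. The paths, column names, and cleaning rules are hypothetical examples, not requirements taken from the posting.

```python
# Illustrative PySpark ETL job (bronze -> silver in a Medallion layout).
# Paths, columns, and cleaning rules are hypothetical, not from the posting.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("orders-bronze-to-silver")
    .getOrCreate()
)

# Ingest raw CSV files landed by an upstream source (bronze layer).
raw = (
    spark.read
    .option("header", True)
    .option("inferSchema", True)
    .csv("/mnt/datalake/bronze/orders/")
)

# Basic cleaning and validation: drop rows missing the key, deduplicate,
# normalise types, and keep only non-negative amounts.
clean = (
    raw.dropna(subset=["order_id"])
       .dropDuplicates(["order_id"])
       .withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("order_date", F.to_date("order_ts"))
       .withColumn("amount", F.col("amount").cast("double"))
       .filter(F.col("amount") >= 0)
)

# Persist to the silver layer as partitioned Parquet; where Delta Lake is
# available, the same write would use .format("delta") instead.
(
    clean.write
    .mode("overwrite")
    .partitionBy("order_date")
    .parquet("/mnt/datalake/silver/orders/")
)
```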

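The posting also lists pytest and Great Expectations under testing. Below is a small, hypothetical pytest example for the cleaning step sketched above; the clean_orders helper and its rules are illustrative assumptions, not part of the job description.

```python
# Hypothetical pytest sketch for a PySpark cleaning step.
import pytest
from pyspark.sql import SparkSession
from pyspark.sql import functions as F


def clean_orders(df):
    """Drop rows without an order_id, deduplicate, and enforce non-negative amounts."""
    return (
        df.dropna(subset=["order_id"])
          .dropDuplicates(["order_id"])
          .filter(F.col("amount") >= 0)
    )


@pytest.fixture(scope="session")
def spark():
    # Local Spark session for fast, isolated unit tests.
    return SparkSession.builder.master("local[1]").appName("tests").getOrCreate()


def test_clean_orders_removes_nulls_duplicates_and_negatives(spark):
    rows = [
        ("o1", 10.0),
        ("o1", 10.0),   # duplicate order_id
        (None, 5.0),    # missing key
        ("o2", -3.0),   # negative amount
        ("o3", 7.5),
    ]
    df = spark.createDataFrame(rows, ["order_id", "amount"])

    result = clean_orders(df)

    assert {r.order_id for r in result.collect()} == {"o1", "o3"}
```
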
Employment Category:

Employment Type: Full time
Industry: IT Services & Consulting
Role Category: Not Specified
Functional Area: Not Specified
Role/Responsibilities: Data Engineer - Python, PySpark, SQL, Spark

Contact Details:

Company: Siemens
Location(s): Other Karnataka



Keyskills: Spark, IDE, Languages, SQL, Storage, NoSQL, Cubes, SQL Server, Hadoop


Salary: ₹ Not Specified

Similar positions

Linux L2 Job in NTT Data, Inc. at Other

  • Consultancy
  • 5 to 9 Yrs
  • Other Maharashtra
  • 18 hours ago
₹ Not Specified

S&C Global Network - AI - Hi Tech - Data

  • Accenture
  • 3 to 7 Yrs
  • Other Karnataka
  • 2 days ago
₹ Not Specified

Data Analyst Job in Reliance Industries

  • Reliance Industries
  • 3 to 7 Yrs
  • Jamnagar
  • 3 days ago
₹ Not Specified

Siemens

We at Siemens Large Drives Applications (LDA) engineer and produce heavy-duty electrical drive systems for medium- and high-voltage ranges: electrical motors, converters, and generators. Additionally, we offer special large drives for ships, mines, and rolling mills. Our digitalization ...
