Data Engineer (5+ Years Experience) – Heavy Data Analytics Project
Tech Stack: Spark, PySpark, Scala, Python, SQL, Databricks, Data Lake, Data Warehouse, Snowflake, Azure (ADF/Synapse/ADLS)
About the Role
We are hiring a Data Engineer with strong hands-on experience building high‑performance data pipelines for a heavy data analytics project. The ideal candidate excels at writing complex aggregations, translating business processes and analytical requirements into data models, and designing scalable data lake and data warehouse solutions. Experience across multiple data platforms (Databricks, Snowflake, Azure Data Factory, Synapse, etc.) is a strong advantage.
Key Responsibilities
1. Data Pipeline & ETL/ELT Development
Develop, optimize, and productionize Spark (PySpark/Scala) pipelines.
Ingest, transform, cleanse, and aggregate large datasets from varied sources.
Implement scalable ETL/ELT logic for batch and near-real-time pipelines.
Apply best practices in partitioning, caching, Delta Lake optimization, and performance tuning.