CRISIL Ltd

Technical Lead - Data Engineering with Databricks and PySpark

Mumbai/Bombay
Not disclosed
Work from Office
Full Time
Min. 10 years

Job Description

Technical Lead – Databricks & PySpark

We are seeking a highly skilled Technical Lead with strong expertise in Databricks, Python, and PySpark to lead data engineering initiatives. The ideal candidate will drive the design, development, and optimization of scalable data pipelines while mentoring a team of engineers and collaborating with cross-functional stakeholders.

Key Responsibilities

  • Lead the design and development of data pipelines and ETL/ELT workflows using Databricks and PySpark
  • Architect and implement scalable, high-performance data solutions on cloud platforms (AWS/GCP)
  • Collaborate with data architects, analysts, and business teams to translate requirements into technical solutions
  • Optimize data processing jobs for performance, reliability, and cost efficiency
  • Ensure data quality, governance, and security standards are followed
  • Mentor and guide junior engineers; perform code reviews and enforce best practices
  • Drive adoption of CI/CD, DevOps, and automated testing in data engineering workflows
  • Troubleshoot and resolve production issues, ensuring high availability of data systems

Required Skills & Qualifications

  • Strong experience in Python and PySpark development
  • Hands-on expertise with Databricks (workflows, Delta Lake, notebooks, cluster management)
  • Solid understanding of data engineering concepts, distributed computing, and big data processing
  • Experience with SQL and relational/NoSQL databases
  • Expertise in data modeling, partitioning, and performance tuning
  • Proficiency with cloud platforms (AWS, GCP, or equivalent)
  • Familiarity with Delta Lake, streaming (Structured Streaming), and batch workloads
  • Strong knowledge of Git, CI/CD pipelines, and DevOps practices
  • Experience with workflow orchestration tools (Airflow, Temporal, etc.)

Preferred Qualifications

  • Experience with data warehousing and lakehouse architecture
  • Knowledge of ML pipelines or MLOps integration
  • Exposure to data governance tools and frameworks
  • Certification in Databricks is a plus

Leadership & Soft Skills

  • Proven experience in technical leadership and team management
  • Strong problem-solving and analytical abilities
  • Excellent communication and stakeholder management skills
  • Ability to work in an agile environment and handle multiple priorities

Key Deliverables

  • High-quality, scalable data pipelines
  • Optimized data workflows in Databricks
  • Well-documented architecture and processes
  • Well-mentored, productive engineering team

  

 

 

Case Study: Financial Data Engineering Solution on Databricks

Background

A financial services company processes large volumes of data from multiple systems:

  • Trade transactions (Equities, Derivatives, FX)
  • Market data feeds (real-time stock prices, indices)
  • Customer/account data (KYC, portfolios)
  • Risk and compliance data

The existing system suffers from:

  • High latency in risk reporting
  • Data inconsistency across systems
  • Lack of real-time insights
  • Scalability challenges

The company wants to implement a modern lakehouse architecture using Databricks to enable real-time risk analytics, regulatory reporting, and portfolio insights.

 

Objective

Design and build a scalable, secure, and high-performance financial data platform using Databricks and PySpark to support:

  • Near real-time trade and risk analytics
  • Regulatory reporting (e.g., daily reporting, audit trails)
  • Historical analysis for portfolio performance

 

Task Requirements

1. Data Ingestion

  • Ingest data from:
    • Trade data (batch files / APIs)
    • Real-time market feeds (Kafka/Event Hub)
    • Reference data (customer, instruments)
  • Use:
    • Databricks Auto Loader for incremental ingestion of batch files
    • Structured Streaming for real-time feeds
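
The two ingestion paths above could be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the function names, paths, broker address, and topic are hypothetical, and both functions expect an existing SparkSession (such as the `spark` object in a Databricks notebook) instead of creating one. The `cloudFiles` source is available only on Databricks.

```python
def read_trade_files(spark, source_path, schema_path, file_format="csv"):
    """Incrementally pick up new trade files with Auto Loader (cloudFiles).

    source_path and schema_path are hypothetical storage locations.
    """
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", file_format)
        # Auto Loader keeps the inferred schema here between runs.
        .option("cloudFiles.schemaLocation", schema_path)
        .load(source_path)
    )


def read_market_feed(spark, bootstrap_servers, topic):
    """Subscribe to a Kafka topic that carries market data ticks."""
    return (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", bootstrap_servers)
        .option("subscribe", topic)
        # Start from new messages only; use "earliest" to replay history.
        .option("startingOffsets", "latest")
        .load()
    )
```

Both functions return streaming DataFrames, so the Bronze write can share one code path regardless of the source.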

 

2. Data Transformation

  • Perform:
    • Data cleansing (nulls, incorrect formats)
    • Trade enrichment (join with instrument & customer data)
    • Currency conversion using FX rates
  • Implement key business logic:
    • Daily P&L calculations
    • Exposure aggregation (by asset class, customer, region)
    • Risk metrics (VaR, notional exposure)
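
To make the business logic above concrete, here is a small plain-Python sketch of currency conversion, daily P&L, exposure aggregation, and a historical-simulation VaR. The field names, the FX rates, and the quantile convention for VaR are assumptions chosen for illustration; a real pipeline would express the same logic as PySpark joins and aggregations over the Silver tables.

```python
from collections import defaultdict

# Hypothetical FX rates to USD; a real pipeline would join a rates table.
FX_TO_USD = {"USD": 1.0, "EUR": 1.10, "INR": 0.012}


def to_usd(amount, currency, rates=FX_TO_USD):
    """Convert an amount into USD using the supplied rate table."""
    return amount * rates[currency]


def daily_pnl(quantity, open_price, close_price):
    """Mark-to-market P&L for one position over one day."""
    return quantity * (close_price - open_price)


def exposure_by(trades, key):
    """Sum absolute USD notional per value of `key` (e.g. asset_class)."""
    totals = defaultdict(float)
    for trade in trades:
        notional = abs(trade["quantity"] * trade["price"])
        totals[trade[key]] += to_usd(notional, trade["currency"])
    return dict(totals)


def historical_var(pnl_history, confidence=0.95):
    """Historical VaR: the loss at the (1 - confidence) quantile of past P&L.

    Uses a simple "take the k-th worst outcome" convention; other quantile
    conventions (e.g. interpolation) give slightly different numbers.
    """
    if not pnl_history:
        raise ValueError("pnl_history must not be empty")
    ordered = sorted(pnl_history)  # ascending, so the largest losses come first
    # round() guards against float noise such as 0.9999999999999998.
    index = int(round((1 - confidence) * len(ordered), 9))
    return -ordered[index]
```

For example, `exposure_by(trades, "asset_class")` groups by asset class, and the same function groups by customer or region when given a different key.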

 

3. Data Storage (Lakehouse Design)

  • Implement Medallion Architecture:
    • Bronze: Raw ingested data
    • Silver: Cleaned & standardized data
    • Gold: Aggregated datasets for reporting
  • Use Delta Lake features:
    • ACID transactions
    • Time travel (for audit and compliance)
    • Schema evolution
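
As an illustration of the Delta Lake features listed above, the sketch below reads a table as of an earlier version or timestamp (useful for audit queries) and appends data while letting the schema grow. The function names and paths are hypothetical; `versionAsOf`, `timestampAsOf`, and `mergeSchema` are standard Delta Lake reader and writer options. The caller supplies the SparkSession and DataFrame.

```python
def read_as_of(spark, path, version=None, timestamp=None):
    """Read a Delta table as it was at a given version or timestamp."""
    if (version is None) == (timestamp is None):
        raise ValueError("pass exactly one of version or timestamp")
    reader = spark.read.format("delta")
    if version is not None:
        reader = reader.option("versionAsOf", version)
    else:
        reader = reader.option("timestampAsOf", timestamp)
    return reader.load(path)


def append_with_schema_evolution(df, path):
    """Append to a Delta table, allowing new columns to be added."""
    (
        df.write.format("delta")
        .mode("append")
        # Without mergeSchema, a write with extra columns is rejected.
        .option("mergeSchema", "true")
        .save(path)
    )
```

An auditor's question such as "what did the Gold exposure table contain at yesterday's close?" then becomes a single `read_as_of` call with a timestamp.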

 

 

4. Performance Optimization

  • Optimize PySpark pipelines:
    • Partitioning by trade date, asset class
    • Z-ordering on frequently queried columns (e.g., account_id)
    • Caching intermediate datasets that are reused across steps
  • Tune cluster configurations (autoscaling, job clusters)
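
A rough sketch of the partitioning and Z-ordering steps above follows. The table names and helper functions are hypothetical. The helper only builds the `OPTIMIZE ... ZORDER BY` statement as a string, to be passed to `spark.sql(...)` on Databricks, and it rejects Z-order columns that are also partition columns, on the assumption that Z-ordering is meant for non-partition columns.

```python
import re

# Conservative pattern for table and column names used in generated SQL.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_.]*$")


def zorder_statement(table, zorder_cols, partition_cols=()):
    """Build an OPTIMIZE ... ZORDER BY statement for a Delta table."""
    for name in (table, *zorder_cols):
        if not _IDENT.match(name):
            raise ValueError(f"unexpected identifier: {name!r}")
    overlap = set(zorder_cols) & set(partition_cols)
    if overlap:
        raise ValueError(f"Z-order columns are also partition columns: {overlap}")
    return f"OPTIMIZE {table} ZORDER BY ({', '.join(zorder_cols)})"


def write_partitioned(df, table, partition_cols=("trade_date", "asset_class")):
    """Overwrite a Delta table, partitioned by trade date and asset class."""
    (
        df.write.format("delta")
        .partitionBy(*partition_cols)
        .mode("overwrite")
        .saveAsTable(table)
    )
```

For instance, `zorder_statement("gold.trades", ["account_id"])` produces the statement that clusters files by `account_id` within each partition.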

 

5. Data Quality & Governance

  • Implement:
    • Data validation rules (e.g., missing trade IDs, invalid prices)
    • Reconciliation checks (trade counts vs source)
  • Ensure:
    • Data lineage tracking
    • Role-based access control (RBAC)
    • Sensitive data masking (PII, financial data)
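
The validation, reconciliation, and masking requirements above might look like this in plain Python. The rule set, field names, and masking policy are assumptions for illustration only; in Databricks, access control and lineage would normally be handled by the platform's governance layer and are not shown here.

```python
import hashlib


def validate_trade(trade):
    """Return a list of rule violations for one trade record (empty if clean)."""
    errors = []
    if not trade.get("trade_id"):
        errors.append("missing trade_id")
    price = trade.get("price")
    if not isinstance(price, (int, float)) or price <= 0:
        errors.append("invalid price")
    return errors


def reconcile_counts(source_count, loaded_count, rejected_count=0):
    """True when every source record is either loaded or explicitly rejected."""
    return source_count == loaded_count + rejected_count


def mask_account(account_id, visible=4):
    """Replace all but the last `visible` characters with asterisks."""
    keep = min(max(visible, 0), len(account_id))
    cut = len(account_id) - keep
    return "*" * cut + account_id[cut:]


def pseudonymize(value, salt):
    """Salted SHA-256 digest, so masked data can still be joined on the value."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
```

Records that fail `validate_trade` would typically be routed to a quarantine table, and their count fed into `reconcile_counts` as `rejected_count`.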

 

6. Streaming & Real-Time Processing

  • Build streaming pipelines for:
    • Real-time market data ingestion
    • Intraday risk calculations
  • Ensure:
    • Low latency processing
    • Fault-tolerant design (checkpointing, retries)
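
The fault-tolerance requirement above mostly comes down to giving each streaming query its own checkpoint location. The sketch below shows only that write side, under assumed names: the function, table name, checkpoint path, and trigger interval are hypothetical, and the intraday risk calculation itself is left out. `stream_df` is expected to be a streaming DataFrame produced by a `readStream` source.

```python
def start_risk_stream(stream_df, target_table, checkpoint_path,
                      interval="10 seconds"):
    """Start a streaming write into a Delta table and return the query handle."""
    return (
        stream_df.writeStream.format("delta")
        .outputMode("append")
        # Offsets and state are stored here, so a restarted query resumes
        # from where the previous run stopped.
        .option("checkpointLocation", checkpoint_path)
        # Shorter intervals lower latency at the cost of more micro-batches.
        .trigger(processingTime=interval)
        .toTable(target_table)
    )
```

Restart behaviour after a failure is then a matter of rerunning the job with the same checkpoint path, which pairs naturally with job-level retries in the orchestrator.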

 

7. Orchestration

  • Implement pipeline orchestration using:
    • Databricks Workflows / Airflow / Azure Data Factory
  • Handle:
    • Dependencies (e.g., reference data before trade enrichment)
    • Job retries and alerts
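
Independent of the tool chosen, the dependency and retry handling above can be illustrated with the Python standard library. The task names and the `alert` callback are hypothetical stand-ins; an orchestrator such as Databricks Workflows or Airflow provides the same behaviour through its own configuration.

```python
import time
from graphlib import TopologicalSorter

# Hypothetical task graph: each task maps to the tasks it depends on.
PIPELINE = {
    "load_reference_data": set(),
    "ingest_trades": set(),
    "enrich_trades": {"load_reference_data", "ingest_trades"},
    "aggregate_exposure": {"enrich_trades"},
}


def run_with_retries(task, max_attempts=3, delay_seconds=0.0, alert=print):
    """Run a callable, retrying on failure and alerting on every failed attempt."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception as exc:
            alert(f"attempt {attempt}/{max_attempts} failed: {exc}")
            if attempt == max_attempts:
                raise  # give up and surface the last error
            time.sleep(delay_seconds)


def run_pipeline(graph, tasks):
    """Run tasks in an order that respects dependencies; return that order."""
    order = list(TopologicalSorter(graph).static_order())
    for name in order:
        run_with_retries(tasks[name])
    return order
```

With this graph, `enrich_trades` can never start before `load_reference_data` has finished, which is the dependency called out above.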

 

8. CI/CD & Deployment

  • Use Git-based workflows:
    • Branching strategy
    • Code reviews
  • Implement CI/CD pipelines for:
    • Automated testing
    • Deployment to environments (Dev/Test/Prod)
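
For the automated-testing step above, a CI pipeline would typically run unit tests against small, pure transformation functions on every pull request. The example below is a sketch with a hypothetical `clean_trade` function and assumed field names, not a prescribed test suite.

```python
import unittest


def clean_trade(raw):
    """Standardize one raw trade record; return None if it must be rejected."""
    trade_id = (raw.get("trade_id") or "").strip()
    if not trade_id:
        return None
    return {
        "trade_id": trade_id,
        "currency": raw["currency"].strip().upper(),
        "price": float(raw["price"]),
    }


class CleanTradeTest(unittest.TestCase):
    def test_normalizes_fields(self):
        raw = {"trade_id": " T1 ", "currency": " usd", "price": "101.50"}
        expected = {"trade_id": "T1", "currency": "USD", "price": 101.5}
        self.assertEqual(clean_trade(raw), expected)

    def test_rejects_missing_trade_id(self):
        raw = {"trade_id": None, "currency": "USD", "price": "1"}
        self.assertIsNone(clean_trade(raw))
```

Running `python -m unittest` (or pytest) in the CI job discovers and executes these tests before any deployment to Dev, Test, or Prod.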

Open Positions

1

Mandatory Skills

PySpark, Databricks, Data Engineer, Lead Data Engineer, Python

Education Qualification

Postgraduate or graduate degree in Computers or its equivalent

Experience

10 to 12 years

Job role

Work location
Hyderabad / Mumbai
Department
Data Science & Analytics
Role / Category
Data Science & Machine Learning
Employment type
Full Time
Shift
Day Shift

Job requirements

Experience
Min. 10 years

About company

Name
CRISIL Ltd
Job posted by CRISIL Ltd
