Senior Data Engineer

The Scalers

Bengaluru/Bangalore

Not disclosed

Work from Office

Full Time

Min. 5 years

Job Details

Job Description

As a Senior Data Engineer (Back Office), you will be an integral part of the Data and DB Engineering team within our Infrastructure organization. This fast-paced, dynamic role involves designing and maintaining cutting-edge data pipelines, integrating complex data sources, and ensuring the seamless flow of data across the firm.

ROLES AND RESPONSIBILITIES

  • Data Pipeline Development: Design, build, and maintain robust ETL pipelines using Python, PySpark, and AWS Glue to enable seamless data ingestion and processing (a minimal sketch follows this list).
  • Data Integration: Integrate various structured and unstructured data sources, including third-party APIs and internal databases, into AWS-based data lakes and data warehouses.
  • Data Transformation: Utilize PySpark and Pandas for large-scale data transformations to prepare data for analytics and machine learning models.
  • Automation & Orchestration: Implement and maintain data workflow automation using Apache Airflow, ensuring reliable data processing and delivery.
  • Troubleshooting & Issue Resolution: Actively troubleshoot data pipeline issues, identify root causes, and implement solutions in a timely manner, ensuring data availability.
  • Optimization: Continuously optimize data pipelines for performance, ensuring scalability and efficiency in handling large datasets across the firm.
  • Collaboration: Partner with data scientists, analysts, and business stakeholders to ensure data infrastructure supports business needs and advanced analytics.
  • Ownership & Accountability: Take full ownership of the end-to-end data pipeline lifecycle, from development to production monitoring and troubleshooting.
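
A minimal sketch of the pipeline work described above, assuming a plain PySpark runtime; on AWS Glue the same DataFrame logic would run inside a Glue job script. The bucket names and the desk, trade_date, and notional columns are hypothetical.

# Illustrative ETL sketch; all paths and column names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily_notional_etl").getOrCreate()

# Extract: raw CSV files landed in the ingestion bucket.
raw = spark.read.option("header", True).csv("s3://example-raw/trades/")

# Transform: type the columns and aggregate notional per desk per day.
daily = (
    raw.withColumn("trade_date", F.to_date("trade_date"))
       .withColumn("notional", F.col("notional").cast("double"))
       .groupBy("desk", "trade_date")
       .agg(F.sum("notional").alias("total_notional"))
)

# Load: date-partitioned Parquet in the curated zone of the data lake.
(daily.write.mode("overwrite")
      .partitionBy("trade_date")
      .parquet("s3://example-curated/daily_notional/"))

spark.stop()

Partitioning the curated output by date keeps downstream queries selective and makes reprocessing a single day cheap.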

 

KEY REQUIREMENTS

  • 5+ years of prior experience in Data Engineering.
  • 5+ years of experience with databases and data warehouses such as SQL Server, PostgreSQL, MongoDB, or Snowflake.
  • 3+ years of experience in Python and its libraries (e.g., Pandas) for data engineering tasks.
  • 3+ years of experience in PySpark for distributed data processing at scale.
  • 1+ year of experience with AWS cloud services, including but not limited to AWS Glue, S3, and IAM.
  • Prior experience working with Airflow for automating workflows and managing ETL processes (see the DAG sketch after this list).
  • Familiarity with DevOps tools such as CI/CD pipelines, Git, and infrastructure-as-code (Terraform preferred).
  • Strong troubleshooting skills with the ability to resolve data and performance issues efficiently in a high-pressure environment.
  • Knowledge of data architecture best practices for data lakes and data warehousing.
  • Self-directed problem solver who can work independently on a task and see deliverables through to completion.
  • Bachelor’s or Master’s degree in Computer Science or related field.
  • Strong problem-solving skills and attention to detail.
  • Strong written and spoken English communication skills.
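
A minimal Airflow sketch of the workflow automation described above, assuming Airflow 2.x; the dag_id, task, and callable are hypothetical placeholders for real extract/transform/load steps.

# Illustrative DAG sketch; names and schedule are hypothetical.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator

def run_etl():
    # Placeholder for the real extract/transform/load logic.
    print("running ETL step")

with DAG(
    dag_id="daily_etl",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",   # one run per day
    catchup=False,                # do not backfill missed runs
    default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
) as dag:
    PythonOperator(task_id="run_etl", python_callable=run_etl)

In practice each stage (extract, transform, load, data-quality checks) would be its own task, so retries and monitoring operate per stage rather than on the whole pipeline.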

Job role

Work location: Bangalore
Department: IT & Information Security
Role / Category: Data Science & Machine Learning
Employment type: Full Time
Shift: Day Shift

Job requirements

Experience: Min. 5 years

About company

Name: The Scalers
