Senior Data Engineer
The Scalers
Bengaluru/Bangalore
Salary
Not disclosed
Job Description
As a Senior Data Engineer (Back Office), you will be an integral part of the Data and DB Engineering team within our Infrastructure organization. This fast-paced, dynamic role involves designing and maintaining cutting-edge data pipelines, integrating complex data sources, and ensuring the seamless flow of data across the firm.
ROLES AND RESPONSIBILITIES
- Data Pipeline Development: Design, build, and maintain robust ETL pipelines using Python, PySpark, and AWS Glue to enable seamless data ingestion and processing (see the sketch after this list).
- Data Integration: Integrate various structured and unstructured data sources, including third-party APIs and internal databases, into AWS-based data lakes and data warehouses.
- Data Transformation: Utilize PySpark and Pandas for large-scale data transformations to prepare data for analytics and machine learning models.
- Automation & Orchestration: Implement and maintain data workflow automation using Apache Airflow, ensuring reliable data processing and delivery.
- Troubleshooting & Issue Resolution: Troubleshoot data pipeline issues, identify root causes, and implement fixes promptly to maintain data availability.
- Optimization: Continuously optimize data pipelines for performance, ensuring scalability and efficiency in handling large datasets across the firm.
- Collaboration: Partner with data scientists, analysts, and business stakeholders to ensure data infrastructure supports business needs and advanced analytics.
- Ownership & Accountability: Take full ownership of the end-to-end data pipeline lifecycle, from development to production monitoring and troubleshooting.
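The stack described above (PySpark transforms orchestrated by Apache Airflow, reading from and writing to S3) follows a common ETL pattern. Purely as an illustration of that pattern, not as part of the job description, here is a minimal sketch: the bucket paths, column names, and DAG id are hypothetical, and a real AWS Glue job would typically be submitted through Glue itself rather than a local SparkSession.

```python
# Illustrative sketch only: all paths, columns, and identifiers are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def run_etl(ds: str, **_) -> None:
    """Read raw JSON from S3, clean it, and write partitioned Parquet."""
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("backoffice_etl").getOrCreate()

    # Hypothetical raw zone of an S3 data lake, partitioned by load date.
    raw = spark.read.json(f"s3://example-raw-bucket/trades/{ds}/")

    cleaned = (
        raw.dropDuplicates(["trade_id"])                     # de-duplicate on a business key
        .withColumn("trade_ts", F.to_timestamp("trade_ts"))  # normalize timestamps
        .filter(F.col("amount") > 0)                         # drop obviously invalid rows
    )

    # Write to a curated zone; date partitioning lets consumers prune scans.
    cleaned.write.mode("overwrite").parquet(
        f"s3://example-curated-bucket/trades/ds={ds}/"
    )
    spark.stop()


# A daily DAG with a single task; Airflow injects the logical date as `ds`.
with DAG(
    dag_id="backoffice_trades_etl",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    PythonOperator(task_id="run_etl", python_callable=run_etl)
```

In practice a team like this one might swap the PythonOperator for a Glue-specific operator so the heavy lifting runs on Glue workers rather than the scheduler, but the shape of the pipeline (ingest, transform, partitioned write, scheduled orchestration) stays the same.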
KEY REQUIREMENTS
- 5+ years of experience in data engineering.
- 5+ years of experience with databases and data warehouses such as SQL Server, PostgreSQL, MongoDB, or Snowflake.
- 3+ years of experience in Python and its libraries (e.g., Pandas) for data engineering tasks.
- 3+ years of experience in PySpark for distributed data processing at scale.
- 1+ year of experience with AWS cloud services, including but not limited to AWS Glue, S3, and IAM.
- Prior experience with Apache Airflow for automating workflows and managing ETL processes.
- Familiarity with DevOps tools such as CI/CD pipelines, Git, and infrastructure-as-code (Terraform preferred).
- Strong troubleshooting skills with the ability to resolve data and performance issues efficiently in a high-pressure environment.
- Knowledge of data architecture best practices for data lakes and data warehousing.
- A self-directed problem solver who can work independently and take ownership of deliverables.
- Bachelor’s or Master’s degree in Computer Science or related field.
- Strong problem-solving skills and attention to detail.
- Strong written and spoken English communication skills.
Job role
Work location
Bangalore
Department
IT & Information Security
Role / Category
Data Science & Machine Learning
Employment type
Full Time
Shift
Day Shift
Job requirements
Experience
Min. 5 years
About company
Name
The Scalers
Job posted by The Scalers
Apply on company website