Big Data Engineer - PySpark & Hadoop Specialist
Synechron Technologies
Job Description
Big Data Engineer | PySpark, Hadoop Ecosystem, Cloud Integration & Data Migration
Job Summary
Synechron is seeking a seasoned Big Data Engineer specialized in PySpark to support complex data processing and ETL workflows within enterprise environments. The role involves designing, developing, and optimizing scalable data pipelines supporting analytics, data migration, and high-volume processing needs. The candidate will leverage their expertise in Hadoop ecosystem components, distributed computing, and storage formats to deliver high-performance, maintainable solutions aligned with business and regulatory requirements.
Software Requirements
Required Software Proficiency:
SQL (T-SQL, HiveQL, or ANSI SQL) — strong skills supporting data validation, query optimization, and data management (4+ years)
Hadoop Ecosystem: HDFS, Hive, Pig, Sqoop, Spark, or Impala — extensive experience supporting large-scale data processing and pipeline development (4+ years)
Data Ingestion and ETL tools supporting enterprise workflows — proven ability to develop and optimize data pipelines (4+ years)
Distributed computing concepts (MapReduce, Spark) supporting high-volume data processing
Knowledge of file formats: Parquet, ORC, Avro, JSON, CSV — supporting data storage and retrieval efficiency
Performance tuning for queries and data pipelines supporting operational and analytical workloads
Scripting skills: Python, Shell, or Scala for automation and pipeline scripting (preferred)
Preferred Software Skills:
Cloud data platforms (Azure, AWS, GCP) for scalable deployment, storage, and processing
Data workflow orchestration tools supporting automation of data pipelines (e.g., Apache Airflow, Oozie)
Overall Responsibilities
Design, develop, and optimize scalable data pipelines supporting analytics, migration, and operational reporting
Build high-performance ETL workflows using PySpark, Spark SQL, and Hadoop ecosystem components
Support data ingestion, transformation, and validation activities ensuring data quality and consistency
Collaborate with data science, data engineering, and business teams to translate requirements into technical solutions
Tune performance of data queries, Spark jobs, and storage formats to support high-volume workloads
Implement data governance, security, and compliance practices supporting industry standards and regulations
Maintain operational documentation, data lineage, and best practices for pipeline management
Lead efforts to improve automation, pipeline reliability, and system scalability supporting enterprise growth
Technical Skills (By Category)
Languages & Data Tools (Essential):
Python, Spark SQL, HiveQL, or ANSI SQL supporting scalable data transformations and queries
Hadoop ecosystem components: HDFS, Hive, Pig, Sqoop, Impala supporting large-scale data pipelines
Databases & Data Management:
Relational: SQL Server, Oracle, PostgreSQL for transactional and reference data validation
Data storage formats: Parquet, ORC, Avro for efficient data management and retrieval
Cloud & Infrastructure:
Cloud platforms (Azure, AWS, GCP) for scalable storage and processing (preferred)
Data orchestration tools supporting automation (e.g., Airflow, Oozie) (preferred)
Frameworks & Libraries:
PySpark and Spark SQL for large-scale data transformation and processing
Tools & Methodologies:
ETL/ELT development, workflow automation, performance tuning practices supporting agile environments
Security & Governance:
Data masking, encryption, and access controls aligned with compliance standards (HIPAA, GDPR)
Experience Requirements
4+ years of experience supporting large-scale data processing, data pipelines, and ETL workflows in enterprise environments
Proven expertise in Hadoop ecosystem components, Spark, and distributed data processing
Experience in data validation, reconciliation, and storage optimization supporting analytics and migration
Knowledge in supporting regulated environments with compliance, security, and data governance standards (preferred)
Alternative pathways include extensive experience in data engineering, supporting high-volume data systems, and automation
Day-to-Day Activities
Develop, test, and optimize data pipelines using PySpark, Hive, and Hadoop ecosystem components
Support data ingestion, transformation, and validation for business analytics and migration projects
Monitor system performance, troubleshoot data processing issues, and implement optimizations
Collaborate with data analysts, data scientists, and enterprise data teams on technical solutions
Support cloud or on-premises data warehouse environments used for enterprise analytics
Implement and support data governance practices, security controls, and compliance measures
Maintain detailed documentation supporting operational procedures, data flows, and data lineage
Automate workflows and iteratively improve pipeline reliability and performance
Qualifications
Bachelor’s or Master’s degree in Data Engineering, Computer Science, or a related field
4+ years supporting big data solutions, ETL workflows, and data migration in enterprise settings
Experience with Hadoop ecosystem, Spark, and distributed data processing platforms
Experience with cloud data services for large-scale, high-volume workloads (preferred)
Certifications in Hadoop, Spark, or cloud platforms (e.g., AWS, GCP, Azure) are a plus
Professional Competencies
Strong analytical and troubleshooting skills supporting complex data workflows
Leadership skills to guide junior team members and promote best practices in data engineering
Excellent communication for stakeholder engagement, documentation, and reporting
Adaptability to evolving data standards, tools, and regulatory frameworks
Commitment to data quality, security, and operational efficiency
Time management and organizational skills for handling multiple data projects in a fast-paced environment
SYNECHRON’S DIVERSITY & INCLUSION STATEMENT
Diversity & Inclusion are fundamental to our culture, and Synechron is proud to be an equal opportunity workplace and is an affirmative action employer. Our Diversity, Equity, and Inclusion (DEI) initiative ‘Same Difference’ is committed to fostering an inclusive culture – promoting equality, diversity and an environment that is respectful to all. We strongly believe that a diverse workforce helps build stronger, successful businesses as a global company. We encourage applicants from across diverse backgrounds, race, ethnicities, religion, age, marital status, gender, sexual orientations, or disabilities to apply. We empower our global workforce by offering flexible workplace arrangements, mentoring, internal mobility, learning and development programs, and more.
All employment decisions at Synechron are based on business needs, job requirements and individual qualifications, without regard to the applicant’s gender, gender identity, sexual orientation, race, ethnicity, disabled or veteran status, or any other characteristic protected by law.
Experience Level
Senior Level
The salary offered will depend on your skills, experience, and performance in the interview.
Candidates who have completed the required education and have 4 to 31 years of experience are eligible to apply. This is a full-time job in Pune.
Candidates should have sound communication skills.
Both male and female candidates can apply for this job.
This is not a work-from-home job and cannot be done online.
No work-related deposit needs to be made during your employment with the company.
To apply, go to the apna app, click the apply button, and call HR directly to schedule your interview.