Synechron Technologies

PySpark Data Engineer with Cloudera and Cloud Expertise

Bengaluru/Bangalore
Not disclosed
Work from Office
Full Time
Min. 5 years

Job Description

Job Summary
Synechron is seeking a highly experienced PySpark Data Engineer to design, develop, and maintain scalable, high-quality data pipelines within the Cloudera Data Platform (CDP). This role is critical in ensuring reliable data ingestion, transformation, and availability for advanced business analytics, reporting, and data science initiatives. The successful candidate will bring a strong background in big data processing, data architecture, and cloud integration, contributing to data-driven decision-making and operational excellence across the organization.

Software Requirements

  • Required:

    • Advanced proficiency in PySpark, including handling RDDs, DataFrames, Spark SQL, and optimization techniques

    • Hands-on experience with Cloudera Data Platform (CDP) components such as Cloudera Manager, Hive, Impala, HDFS, and HBase

    • Working knowledge of Hadoop ecosystem, Kafka, and distributed data processing tools

    • Experience with SQL-based data warehousing tools like Hive and Impala

    • Scripting skills in Linux (Bash, Python) for automation and operational tasks

    • Familiarity with orchestration and scheduling tools such as Apache Airflow or Oozie

  • Preferred:

    • Knowledge of cloud-native data services (AWS Glue, EMR, Azure Data Factory)

    • Experience with version control systems such as Git and with CI/CD pipelines (Jenkins, GitLab CI)

    • Experience with data modeling, data governance, and metadata management tools

Overall Responsibilities

  • Design, develop, and optimize scalable data pipelines using PySpark within the Cloudera Data Platform.

  • Manage end-to-end data ingestion processes from multiple sources (relational databases, APIs, file systems) into data lakes or warehouses.

  • Execute data transformation, cleansing, and aggregation processes supporting analytical and reporting requirements.

  • Conduct performance tuning of Spark jobs and related CDP components to ensure efficient resource utilization.

  • Implement data validation and quality checks, ensuring data accuracy and consistency through monitoring and alerting.

  • Automate data workflows using orchestration tools like Airflow or Oozie to reduce manual intervention.

  • Monitor pipeline performance, troubleshoot failures, and implement corrective actions for operational stability.

  • Collaborate with data architects, analysts, and data scientists to support large-scale analytics initiatives.

  • Document data architecture, pipeline configurations, and operational procedures for ongoing maintenance and governance.

  • Lead data architecture discussions supporting data privacy, security, and compliance standards.

Technical Skills (By Category)

  • Programming Languages (Essential):

    • Python (especially PySpark)

    • SQL for data extraction, validation, and analysis

  • Big Data & Data Management (Essential):

    • Spark (PySpark), Hadoop ecosystem, HDFS, Hive, Impala, HBase

    • Data ingestion and transformation in large distributed environments

  • Cloud & Platform Technologies (Preferred):

    • Cloud-native data processing (AWS EMR, Azure HDInsight, GCP Dataproc)

  • Frameworks & Libraries (Essential):

    • Spark SQL, Spark Streaming

    • Data modeling and governance tools (preferred: Apache Atlas or Collibra)

  • Orchestration & Automation (Preferred):

    • Airflow, Oozie, Jenkins

  • Security & Data Governance (Preferred):

    • Data masking, encryption, access control in distributed systems

Experience Requirements

  • Minimum of 5 years as a Data Engineer with deep expertise in PySpark and big data processing

  • Proven experience designing, implementing, and maintaining scalable data pipelines in enterprise environments

  • Strong background with Cloudera Data Platform (CDP) components such as Hive, Impala, HDFS, and HBase

  • Demonstrated ability to optimize Spark jobs and manage high-volume data workflows

  • Experience supporting data processing in cloud environments (AWS, Azure, or GCP) is advantageous

  • Industry experience supporting financial services, banking, or highly regulated sectors is a plus

  • Alternative pathways include extensive hands-on Big Data processing experience in data-centric roles with demonstrated expertise in performance tuning and operational stability

Day-to-Day Activities

  • Develop and optimize Spark (PySpark) data pipelines for ingesting, transforming, and publishing data in large distributed systems.

  • Monitor data workflows and troubleshoot issues proactively to maintain pipeline health.

  • Collaborate with data scientists, analysts, and platform teams to meet data quality, security, and governance standards.

  • Automate operational workflows, including job scheduling, alerting, and resource management.

  • Tune Spark jobs and related components to optimize runtime and resource efficiency.

  • Conduct data validation, anomaly detection, and data quality assessments.

  • Document architecture, data flows, and operational procedures for compliance and knowledge sharing.

  • Support ongoing system upgrades, data privacy initiatives, and cloud migration efforts.

Qualifications

  • Bachelor’s or Master’s degree in Computer Science, Data Engineering, Information Systems, or equivalent

  • 5+ years of hands-on experience in data engineering, with an emphasis on PySpark and big data systems

  • Proven expertise in designing scalable, high-performance data pipelines in enterprise environments

  • Hands-on experience with Cloudera Data Platform (CDP), Hadoop, Hive, Impala, and HBase

  • Strong SQL and data modeling skills within distributed data architectures

  • Experience with cloud data services is a plus

  • Relevant certifications (e.g., AWS Data Analytics Specialty, GCP Professional Data Engineer) are advantageous

  • Strong analytical, troubleshooting, and communication skills

Professional Competencies

  • Critical thinking and analytical mindset for complex data workflows and problem resolution

  • Ability to manage multiple priorities and deliver results in a fast-paced environment

  • Effective collaboration skills for cross-team data initiatives and stakeholder engagement

  • Innovation-driven approach for optimizing and automating data processes

  • Ownership mindset to ensure operational stability and data quality standards

  • Adaptability and continuous learning to keep pace with evolving big data and cloud technologies

SYNECHRON’S DIVERSITY & INCLUSION STATEMENT

Diversity & Inclusion are fundamental to our culture, and Synechron is proud to be an equal opportunity workplace and an affirmative action employer. Our Diversity, Equity, and Inclusion (DEI) initiative ‘Same Difference’ is committed to fostering an inclusive culture that promotes equality, diversity, and an environment that is respectful to all. We strongly believe that, as a global company, a diverse workforce helps build stronger, more successful businesses. We encourage applicants of all backgrounds, races, ethnicities, religions, ages, marital statuses, genders, sexual orientations, and disabilities to apply. We empower our global workforce by offering flexible workplace arrangements, mentoring, internal mobility, learning and development programs, and more.


All employment decisions at Synechron are based on business needs, job requirements and individual qualifications, without regard to the applicant’s gender, gender identity, sexual orientation, race, ethnicity, disabled or veteran status, or any other characteristic protected by law.

Experience Level

Senior Level

Job role

Work location: Bengaluru - BCIT, India
Department: Data Science & Analytics
Role / Category: DBA / Data warehousing
Employment type: Full Time
Shift: Day Shift

Job requirements

Experience: Min. 5 years

About company

Name: Synechron Technologies
Job posted by Synechron Technologies
