Junior Data Engineer - GCP
Ford Motor
Job Description
Data Engineer
Responsibilities
• Design and build GCP data-driven solutions for enterprise data warehouses and data lakes.
• Design and implement scalable data architectures on GCP, including data lakes, warehouses, and real-time processing systems.
• Utilize ML Services like Vertex AI.
• Develop and optimize ETL/ELT pipelines using Python, SQL, streaming technologies (e.g., Kafka), and orchestration tools (e.g., Apache Airflow).
• Architect historical and incremental loads and refine the architecture on an ongoing basis.
• Manage and optimize data storage, partitioning, and clustering strategies for high performance and reliability, utilizing services such as BigQuery, Spark, Pub/Sub, and Cloud Storage.
• Collaborate with data scientists, data engineers, and other stakeholders to understand data needs and deliver solutions aligned with business objectives, security, and data governance.
• Automate infrastructure and deployments with Infrastructure as Code (IaC) tools like Terraform and CI/CD practices (e.g., Tekton) to ensure reliability and scalability.
• Operationalize machine learning models by building data infrastructure and managing structured and unstructured data, supporting AI/ML/LLM workflows, including data labeling, classification, and document parsing.
• Monitor and troubleshoot data pipelines and systems to identify and resolve issues related to performance, reliability, and cost-effectiveness.
• Document data processes, pipeline designs, and architecture, contributing to knowledge transfer and system maintenance.
Qualifications and Skills
- Must have:
- Google Cloud Professional Data Engineer certification.
- 2+ years of coding experience with Java/Python and Terraform.
- 2+ years of professional experience.
- Experience working with Agile and Lean methodologies.
- GCP Expertise: Strong proficiency in GCP services, including BigQuery, Dataflow, Dataproc, Data Fusion, Airflow, Pub/Sub, Cloud Storage, Vertex AI, Cloud Functions, and Cloud Composer; experience with GCP-based big data deployments (batch/real-time) leveraging BigQuery and Bigtable.
- Programming & Scripting: Expert-level skills in Python and SQL are essential. Familiarity with languages like Scala or Java can also be beneficial, especially for working with tools like Apache Spark.
- Data Engineering Fundamentals: Solid understanding of data modeling, data warehousing concepts, ETL/ELT processes, and big data architecture, including designing pipelines and architectures for data processing.
- Big Data Technologies: Experience with technologies like Apache Spark, Apache Beam, and Kafka is often required.
- DevOps & MLOps: Knowledge of DevOps methodologies, CI/CD pipelines, and MLOps practices, including integrating data pipelines with ML workflows.
- Security & Compliance: Expertise in implementing Identity and Access Management (IAM) policies, ensuring data encryption, and adhering to data privacy regulations.
- Analytical & Problem-Solving Skills: Demonstrated ability to analyze complex datasets, identify trends, debug issues, and optimize systems for performance and cost efficiency.
- Communication & Collaboration: Excellent communication and teamwork skills, with the ability to collaborate effectively with technical and non-technical stakeholders in agile environments.
- Experience with visualization tools such as Qlik, Looker Studio, and Power BI.
- Knowledge of VBA is a plus.
Experience Level
Mid Level