AI Data Engineer
Hewlett Packard Enterprise India Private Limited

This role has been designed as 'Onsite' with an expectation that you will primarily work from an HPE office.
Who We Are:
Hewlett Packard Enterprise is the global edge-to-cloud company advancing the way people live and work. We help companies connect, protect, analyze, and act on their data and applications wherever they live, from edge to cloud, so they can turn insights into outcomes at the speed required to thrive in today's complex world. Our culture thrives on finding new and better ways to accelerate what's next. We know varied backgrounds are valued and succeed here. We have the flexibility to manage our work and personal needs. We make bold moves, together, and are a force for good. If you are looking to stretch and grow your career, our culture will embrace you. Open up opportunities with HPE.
Job Description:
HPE Financial Services is where we help organizations create the investment they need for digital transformation, in an innovative and sustainable way. We partner with customers across their entire IT asset portfolio, from edge to cloud to end-user. Unique to each client's aspirations and size, our financial and asset management solutions are anchored by best-in-class tech upcycling services. Join us to redefine what's next for you.
Role summary
We are looking for a technically sharp and detail-oriented Data Engineer to join the HPEFS (Hewlett Packard Enterprise Financial Services) Advanced Analytics & BI team in Bangalore. This role is the data backbone that powers our AI capabilities, working in close partnership with the AI Engineers to ensure that the data flowing into AI models, dashboards, and business workflows is clean, governed, and well-structured. You will play a hands-on role and own the backend data lifecycle: ingesting raw data from diverse sources, transforming it into reliable, analysis-ready datasets, enforcing data quality standards, and publishing governed data products via Microsoft Fabric and Databricks. You will also support reporting needs through Power BI and contribute to Collibra-based data governance initiatives. A working familiarity with Microsoft Copilot and AI-assisted data tooling is expected.
What you'll do:
Data Engineering & Transformation
Design, build, and maintain scalable ETL/ELT pipelines using Azure Data Factory, Databricks (PySpark / Delta Live Tables), and Microsoft Fabric Data Factory.
Transform raw, multi-source data into clean, conformed, and analytics-ready datasets following Medallion Architecture principles (Bronze → Silver → Gold).
Develop and optimize SQL and PySpark-based transformation logic for structured, semi-structured, and unstructured data.
Implement incremental load patterns, merge/upsert logic, and slowly changing dimension (SCD) strategies to support historical data tracking.
Collaborate with the AI Engineers to prepare high-quality feature datasets for ML and LLM use cases.
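For candidates unfamiliar with the terminology, the merge/upsert and SCD Type 2 patterns above can be sketched in plain Python. This is a minimal illustration only, with hypothetical column names; in a Databricks pipeline this logic would typically be a Delta Lake MERGE:

```python
from datetime import date

def scd2_merge(dim_rows, incoming, key, tracked_cols, today=None):
    """Minimal SCD Type 2 merge: expire changed rows, append new versions.

    dim_rows: current dimension rows (dicts with 'is_current',
              'valid_from', 'valid_to' bookkeeping columns)
    incoming: latest source snapshot (dicts carrying the natural key)
    Note: expired rows are updated in place for brevity.
    """
    today = today or date.today().isoformat()
    current = {r[key]: r for r in dim_rows if r["is_current"]}
    out = list(dim_rows)
    for row in incoming:
        existing = current.get(row[key])
        if existing and all(existing[c] == row[c] for c in tracked_cols):
            continue  # no tracked attribute changed: keep current version
        if existing:  # attribute changed: close out the old version
            existing["is_current"] = False
            existing["valid_to"] = today
        out.append({**row, "is_current": True,
                    "valid_from": today, "valid_to": None})
    return out
```

A changed row is expired (its `valid_to` is stamped) and a new current version is appended, so history is preserved rather than overwritten.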
Data Quality & Governance
Define, implement, and monitor data quality rules including completeness, accuracy, consistency, timeliness, and uniqueness checks.
Administer and extend the Collibra data governance platform — including business glossary management, data lineage documentation, and stewardship workflows.
Build automated data quality validation frameworks using tools such as Great Expectations, dbt tests, or Unity Catalog data quality constraints in Databricks.
Triage and resolve data quality incidents, root-cause data anomalies, and communicate impact to stakeholders proactively.
Maintain metadata catalogues and ensure all critical datasets have documented ownership, lineage, and classification.
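The quality dimensions above (completeness, uniqueness, and so on) reduce to simple rule functions. A minimal sketch in plain Python, in the spirit of Great Expectations but not using its API; function names and thresholds are illustrative:

```python
def check_completeness(rows, column):
    """Fraction of rows where `column` is non-null."""
    non_null = sum(1 for r in rows if r.get(column) is not None)
    return non_null / len(rows) if rows else 0.0

def check_uniqueness(rows, column):
    """True if no non-null value of `column` repeats."""
    values = [r[column] for r in rows if r.get(column) is not None]
    return len(values) == len(set(values))

def run_checks(rows, rules):
    """rules: list of (name, check_fn, threshold); returns failing checks."""
    failures = []
    for name, fn, threshold in rules:
        score = fn(rows)  # bools compare as 0/1 against the threshold
        if score < threshold:
            failures.append((name, score))
    return failures
```

In production these checks would run inside the pipeline (e.g. as dbt tests or Unity Catalog constraints) and feed the incident-triage process described above.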
Microsoft Fabric & Lakehouse
Build and manage Lakehouses, Warehouses, and Dataflows Gen2 within the Microsoft Fabric ecosystem.
Configure OneLake, shortcuts, and mirroring to unify data across sources without unnecessary duplication.
Leverage Fabric Notebooks (PySpark / Python) and Spark job definitions for large-scale data processing.
Support the semantic model layer in Fabric to ensure Power BI datasets are optimized and governed.
Power BI & Reporting
Develop and maintain Power BI semantic models (star schema design, DAX measures, row-level security).
Build production-grade dashboards and reports for business stakeholders; ensure refresh reliability and performance.
Apply Copilot-assisted authoring in Power BI and Fabric where applicable to accelerate report generation.
Support self-service analytics adoption by publishing governed datasets to the Power BI service.
Collaboration & AI Enablement
Partner closely with the AI Engineers, peer data scientists, and analytics team members to supply clean, structured data for RAG pipelines, model training, and agentic workflows.
Contribute to the design of shared data contracts and API schemas between data engineering and AI engineering layers.
Assist with AI-assisted data tasks using Microsoft Copilot (in Fabric, Power BI, and Azure environments).
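Preparing data for the RAG pipelines mentioned above usually starts with chunking documents into embedding-sized pieces. A crude fixed-size sketch; the function name and parameters are illustrative, and production pipelines typically split on sentence or token boundaries instead:

```python
def chunk_text(text, max_chars=500, overlap=50):
    """Split text into overlapping character windows for embedding.

    Overlap between consecutive chunks preserves context that would
    otherwise be cut at a chunk boundary.
    """
    if max_chars <= overlap:
        raise ValueError("max_chars must exceed overlap")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        start += max_chars - overlap
    return chunks
```

Each chunk would then be embedded and written to a vector store in a "vector-ready" format alongside its source metadata.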
What you need to bring:
Qualifications
Bachelor's or Master's degree in Computer Science, Information Systems, Data Engineering, Mathematics, or a related discipline.
4 – 5 years of hands-on experience in data engineering, ETL development, or analytics engineering roles.
Demonstrable experience with Databricks and/or Microsoft Fabric in a production environment.
Proficiency in Power BI report and semantic model development.
Exposure to Collibra or equivalent data governance / cataloguing platforms is strongly preferred.
Strong SQL and Python skills; PySpark experience is required.
Familiarity with Azure cloud services and DevOps practices for data pipeline deployment.
Technical Skill Requirements
Data Platforms - Databricks (PySpark, Delta Lake, Delta Live Tables, Unity Catalog), Microsoft Fabric (Lakehouse, Warehouse, Dataflows Gen2, Notebooks), Azure Data Lake Storage Gen2
Data Transformation - PySpark, SQL, dbt (data build tool), Azure Data Factory, Fabric Data Factory; Medallion Architecture, SCD types, incremental load patterns
Data Modelling - Star schema, snowflake schema, dimensional modelling, data vault concepts; normalization, entity-relationship design, semantic layer design
Reporting & BI - Power BI (DAX, semantic models, RLS, Power Query / M), Microsoft Fabric Power BI integration, Copilot-assisted authoring in Power BI
Programming - Python (primary), SQL (advanced); PySpark; familiarity with JSON, Parquet, Delta file formats
Cloud & DevOps - Azure (preferred): Synapse, ADF, ADLS Gen2, Key Vault; Git/GitHub for version control; CI/CD basics for pipeline deployment
Data Governance & Cataloguing - data lineage documentation, metadata management, data classification and tagging, business glossary ownership
AI & Copilot Tooling - Microsoft Copilot in Fabric / Power BI; familiarity with AI-assisted data transformation; understanding of LLM data requirements (embeddings, chunking, vector-ready formats)
Data Concepts - Data warehousing, lakehouse architecture, OLAP vs OLTP, event-driven ingestion, streaming basics (Structured Streaming / Event Hubs), data contracts, master data management (MDM)
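At its core, the star-schema modelling listed above is a fact-to-dimension join. A minimal in-memory sketch with hypothetical table and column names, for illustration only:

```python
def join_fact_to_dim(fact_rows, dim_rows, fact_key, dim_key, dim_cols):
    """Enrich fact rows with dimension attributes via a hash lookup,
    the in-memory equivalent of a star-schema join."""
    lookup = {d[dim_key]: d for d in dim_rows}
    enriched = []
    for f in fact_rows:
        dim = lookup.get(f[fact_key], {})  # missing key -> null attributes
        enriched.append({**f, **{c: dim.get(c) for c in dim_cols}})
    return enriched
```

In practice this join happens in SQL or PySpark over Delta tables, with the semantic model exposing the result to Power BI.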
#Financialservices
Additional Skills:
Accountability, Action Planning, Active Learning, Active Listening, Agile Methodology, Agile Scrum Development, Analytical Thinking, Bias, Coaching, Creativity, Critical Thinking, Cross-Functional Teamwork, Data Analysis Management, Data Collection Management, Data Controls, Design, Design Thinking, Empathy, Follow-Through, Group Problem Solving, Growth Mindset, Intellectual Curiosity, Long Term Planning, Managing Ambiguity {+ 5 more}

What We Can Offer You:
Health & Wellbeing
We strive to provide our team members and their loved ones with a comprehensive suite of benefits that supports their physical, financial and emotional wellbeing.
Personal & Professional Development
We also invest in your career because the better you are, the better we all are. We have specific programs catered to helping you reach any career goals you have — whether you want to become a knowledge expert in your field or apply your skills to another division.
Unconditional Inclusion
We are unconditionally inclusive in the way we work and celebrate individual uniqueness. We know varied backgrounds are valued and succeed here. We have the flexibility to manage our work and personal needs. We make bold moves, together, and are a force for good.
Let's Stay Connected:
Follow @HPECareers on Instagram to see the latest on people, culture and tech at HPE.
#india

Job: Engineering
Job Level: TCP_03
HPE is an Equal Employment Opportunity/ Veterans/Disabled/LGBT employer. We do not discriminate on the basis of race, gender, or any other protected category, and all decisions we make are made on the basis of qualifications, merit, and business need. Our goal is to be one global team that is representative of our customers, in an inclusive environment where we can continue to innovate and grow together. Please click here: Equal Employment Opportunity.
Hewlett Packard Enterprise is EEO Protected Veteran/ Individual with Disabilities.
HPE will comply with all applicable laws related to employer use of arrest and conviction records, including laws requiring employers to consider for employment qualified applicants with criminal histories.
No Fees Notice & Recruitment Fraud Disclaimer
It has come to HPE's attention that there has been an increase in recruitment fraud whereby scammers impersonate HPE or HPE-authorized recruiting agencies and offer fake employment opportunities to candidates. These scammers often seek to obtain personal information or money from candidates.
Please note that Hewlett Packard Enterprise (HPE), its direct and indirect subsidiaries and affiliated companies, and its authorized recruitment agencies/vendors will never charge any candidate a registration fee, hiring fee, or any other fee in connection with its recruitment and hiring process. The credentials of any hiring agency that claims to be working with HPE for recruitment of talent should be verified by candidates and candidates shall be solely responsible to conduct such verification. Any candidate/individual who relies on the erroneous representations made by fraudulent employment agencies does so at their own risk, and HPE disclaims liability for any damages or claims that may result from any such communication.
Experience Level: Mid Level