Morningstar India Pvt Ltd

Machine Learning Engineer

Mumbai/Bombay
Not disclosed
Work from Office
Full Time
Min. 2 years

Job Description

Machine Learning Engineer

As a Machine Learning Engineer (MLE) on the AI & ML (Data Collection & Enrichment) team, you will play a critical role in building intelligent systems that acquire, process, and enrich PitchBook’s structured and unstructured data at scale. Your work will directly impact the quality, coverage, and usability of the data that powers downstream analytics, insights, and customer-facing features.

This role requires deep expertise in machine learning, data engineering, and natural language processing (NLP), with a strong emphasis on extracting, structuring, and augmenting data from diverse sources such as reports, filings, news, and web content.

You will design and deploy ML-driven pipelines for entity extraction, entity resolution, classification, and data augmentation, leveraging techniques from NLP, large language models (LLMs), and generative AI. You will be responsible for the full lifecycle of these systems—from data ingestion and model development to deployment, monitoring, and continuous improvement.

Your contributions will ensure that PitchBook maintains high-quality, comprehensive, and timely datasets by transforming raw information into structured, enriched, and reliable data assets.

You will be part of a team of machine learning engineers focused on building scalable systems for data acquisition, extraction, normalization, and enrichment. The team enables high-quality datasets that power critical features across the PitchBook Platform.

You will collaborate closely with data collection teams, platform engineers, and product stakeholders to ensure that data pipelines are robust, efficient, and aligned with business priorities.

Primary Job Responsibilities:

  • Design and build ML-driven data pipelines that ingest and process structured and unstructured data from multiple sources.
  • Develop models for information extraction, named entity recognition (NER), entity resolution, classification, and data normalization.
  • Apply NLP, transformer models, and LLMs to extract and enrich data from documents such as reports, filings, and news articles.
  • Build systems that improve data coverage, accuracy, freshness, and consistency across datasets.
  • Integrate ML models into scalable production systems with strong reliability, latency, and throughput guarantees.
  • Collaborate with data collection and curation teams to incorporate human-in-the-loop feedback and improve model performance.
  • Design evaluation frameworks and metrics for data quality, extraction accuracy, and enrichment effectiveness.
  • Optimize pipelines for large-scale processing using distributed systems and streaming technologies.
  • Contribute to architecture decisions for data infrastructure, ensuring scalability and maintainability.
  • Stay current with advancements in NLP, GenAI, and information extraction, and translate research into production-ready systems.
  • Ensure best practices in monitoring, observability, data governance, and responsible AI usage.
  • Mentor junior engineers and contribute to a culture of technical excellence through reviews and knowledge sharing.

Skills & Qualifications:

  • Bachelor’s (or higher) in Computer Science, Data Science, Mathematics, or a related field.
  • 2+ years of experience in ML engineering, data engineering, or applied AI roles focused on data extraction, enrichment, or processing pipelines.
  • Strong experience in NLP, including NER, parsing, classification, and transformer-based models.
  • Hands-on experience with LLMs / GenAI for structured data extraction, augmentation, or labeling workflows.
  • Experience building data pipelines and distributed systems (e.g., Kafka, Airflow, Spark, Snowflake) is preferred.
  • Proficiency in Python and SQL, with experience using ML frameworks such as PyTorch, TensorFlow, and scikit-learn.
  • Experience deploying ML systems in production, including monitoring and iteration loops, is preferred.
  • Familiarity with the LangChain ecosystem (LangSmith, LangGraph) or similar orchestration tools is a plus.
  • Experience with entity resolution, knowledge graphs, or data deduplication systems is desirable.
  • Strong problem-solving skills and ability to work on ambiguous data challenges.
  • Experience collaborating cross-functionally with engineering, product, and data teams.
  • Prior exposure to financial datasets or fintech ecosystems is a plus.
  • Research experience or publications at NLP/ML conferences (e.g., ACL, EMNLP, NeurIPS) is a strong plus.

Working Conditions:

This position is based in a standard office setting. Employees in this position use PCs and phones on an ongoing basis throughout the day. Limited corporate travel may be required to remote offices or other business meetings and events.

Morningstar's hybrid work environment gives you the opportunity to collaborate in-person each week as we've found that we're at our best when we're purposely together on a regular basis. In most of our locations, our hybrid work model is four days in-office each week. A range of other benefits are also available to enhance flexibility as needs change. No matter where you are, you'll have tools and resources to engage meaningfully with your global colleagues.

Legal Entity: Morningstar India Private Ltd. (Delhi) (I10_MstarIndiaPvtLtd)

Experience Level

Mid Level

Job role

Work location: Mumbai, India
Department: Data Science & Analytics
Role / Category: Data Science & Machine Learning
Employment type: Full Time
Shift: Day Shift

Job requirements

Experience: Min. 2 years

About company

Name: Morningstar India Pvt Ltd
Job posted by Morningstar India Pvt Ltd

The salary for this job is not disclosed. The salary offered will depend on your skills, experience and performance in the interview.

The candidate should have completed the required education, and candidates with 2 to 31 years of experience are eligible to apply for this job. You can apply for more jobs in Mumbai/Bombay to get hired quickly.

The candidate should have sound communication skills for this job.

Both male and female candidates can apply for this job.

This is not a work from home job and can't be done online. You can explore and apply for other work from home jobs in Mumbai/Bombay at apna.

No work-related deposit needs to be made during your employment with the company.

Go to the apna app and apply for this job. Click on the apply button and call HR directly to schedule your interview.

For more details, download the apna app and find Full Time jobs in Mumbai/Bombay. Through apna, you can find jobs in 64 cities across India. Join NOW!

Machine Learning Engineer in Morningstar India Pvt Ltd | apna.co