Senior Python Developer - AI Evaluation Frameworks
Michelin India Private Limited
Job Description
Python Developer (AI Evaluation Frameworks)
Job summary:
We are seeking a seasoned Python engineer with 5–7 years of professional experience and exposure to QA practices to join our team focused on developing AI evaluation frameworks. The ideal candidate combines hands‑on Python engineering skills, a QA mindset, and practical familiarity with GenAI/LLM concepts and Azure cloud services. You will design, build, and maintain scalable evaluation systems and work closely with QA teams and stakeholders to ensure robust, repeatable assessment of AI components.
Key responsibilities
Design and implement AI evaluation frameworks and tooling for model assessment, benchmarking, and automated testing of LLMs, agents, and GenAI features.
Build production‑grade Python applications and APIs to support evaluation pipelines and integrations.
Collaborate with the QA team to brainstorm on current evaluation challenges and build reproducible evaluation workflows.
Implement end‑to‑end evaluation pipelines including data preprocessing, metric computation, test orchestration, and reporting.
Ensure code quality and maintain coding standards through static analysis, unit/integration tests, code reviews, and tooling (e.g., SonarQube).
Contribute to design and implementation of APIs and services.
Deploy and operate evaluation components on Azure, leveraging platform services and following infrastructure‑as‑code practices.
Instrument monitoring, logging, and alerting for evaluation pipelines; capture audit trails and results for compliance and reproducibility.
Partner with data scientists, ML engineers, and product stakeholders to gather requirements, validate evaluation approaches, and incorporate feedback.
Support peers in troubleshooting and resolving issues across development and QA; mentor junior developers and share best practices.
Maintain documentation, runbooks, and related materials for evaluation frameworks.
Build, execute, optimize, and monitor unit tests and unit plans to ensure quality, security, and consistency. Detect, understand, analyze, report, and resolve malfunctions, incidents, and bugs.
Required qualifications
5–7 years of professional Python development experience with strong, demonstrable hands‑on skills.
Solid understanding of OOP concepts, software design principles, and coding best practices.
Experience with test‑driven development, writing unit and integration tests, and collaborating with QA teams on automated testing.
Familiarity with the full project lifecycle: requirements, design, development, code review, deployment, maintenance, and deprecation.
Experience building RESTful APIs using FastAPI, Flask, or Django.
Practical experience with Azure cloud services and deployment patterns (App Services, AKS, Azure Functions, Blob/Storage, DevOps pipelines).
Exposure to CI/CD tooling and code quality tools such as SonarQube.
Working knowledge of AI/DS concepts—particularly GenAI, LLMs, RAG patterns, and agent architectures.
Strong problem-solving and debugging skills, and the ability to work across distributed systems.
Excellent communication skills and demonstrated ability to work closely with QA, data science, and product teams.
Desirable (good‑to‑have)
Experience with LLM frameworks such as LangChain, LlamaIndex, or similar.
Familiarity with observability tools and ML/LLM monitoring.
Prior experience designing evaluation metrics for NLP/LLM tasks (e.g., BLEU/ROUGE, embeddings‑based similarity, human evaluation orchestration).
Prior knowledge of, and experience working on, traditional AI/ML systems.
Behavioral competencies
Mindset: attention to detail, an emphasis on testability and reproducibility, and a strong focus on accuracy, quality, and safety.
Collaborative: able to partner effectively with QA, ML, and product stakeholders.
Proactive communicator: gathers feedback, surfaces risks early, and drives adoption of evaluation tooling.
Mentorship orientation: supports and uplifts team members through knowledge sharing.
Experience Level
Senior Level
You can expect a minimum salary of 0 INR. The salary offered will depend on your skills, experience, and performance in the interview.
Candidates who have completed the required education and have 5 to 7 years of experience are eligible to apply for this job. You can apply for more jobs in Pune to get hired quickly.
The candidate should have sound communication skills for this job.
Both Male and Female candidates can apply for this job.
This is not a work-from-home job and can't be done online. You can explore and apply for other work-from-home jobs in Pune at apna.
No work-related deposit needs to be made during your employment with the company.
Go to the apna app and apply for this job. Click on the apply button and call HR directly to schedule your interview.
For more details, download the apna app and find full-time jobs in Pune. Through apna, you can find jobs in 64 cities across India.