Lead Artificial Intelligence Engineer
Ingram Micro India Private Limited, Chennai
Not disclosed
Job Description
Lead AI Engineer
It's fun to work in a company where people truly BELIEVE in what they're doing!
Deployment & Infrastructure Management:
- Deploy, configure, and manage AI models, agentic systems, and supporting infrastructure in cloud (e.g., GCP) and on-premises environments.
- Implement and maintain CI/CD pipelines for AI/ML models and agentic applications (MLOps/Agent Ops).
- Manage and optimize cloud resources, ensuring cost-effectiveness and scalability for AI workloads.
- Collaborate with infrastructure teams to ensure network, storage, and compute resources meet the demands of AI systems.
Monitoring, Logging & Alerting:
- Develop and implement comprehensive monitoring, logging, and alerting solutions for AI agents and infrastructure to ensure high availability and performance.
- Proactively identify and address potential issues, performance bottlenecks, and anomalies in production AI systems.
- Track key operational metrics and create dashboards for system health and performance.
Incident Response & Troubleshooting:
- Provide operational support for production AI systems, including incident response, root cause analysis, and resolution of technical issues.
- Develop and maintain runbooks and standard operating procedures for common operational tasks and incident management.
- Participate in on-call rotations as needed to support critical AI services.
Automation & Operational Excellence:
- Automate routine operational tasks, deployment processes, and system maintenance activities using scripting (e.g., Python, Bash) and automation tools.
- Contribute to the development and enforcement of operational best practices, security standards, and compliance requirements for AI systems.
- Work with development teams to improve the deployability, manageability, and observability of AI applications.
Collaboration & Documentation:
- Collaborate effectively with AI developers, data scientists, AI architects, and other stakeholders to ensure smooth transitions from development to production.
- Maintain clear and comprehensive documentation for system configurations, operational procedures, and troubleshooting guides.
- Provide feedback to development teams on operational aspects and system performance.
Preferred Qualifications & Experience:
- Bachelor’s degree in Computer Science, Information Technology, Engineering, or a related technical field.
- 4-7+ years of experience in an MLOps or Agent Ops role, preferably supporting AI/ML or data-intensive applications.
- Hands-on experience with cloud computing platforms (e.g., Google Cloud Platform, especially Vertex AI) and managing cloud-based infrastructure.
- Proficiency in scripting languages such as Python, Bash, or PowerShell for automation.
- Experience with CI/CD tools and practices (e.g., Bitbucket Pipelines, GitLab CI, GitHub Actions).
- Familiarity with containerization and orchestration technologies (e.g., Docker, Kubernetes).
- Experience with monitoring and logging tools (e.g., Prometheus, Grafana, ELK Stack, Datadog, Google Cloud Monitoring, Langfuse).
- Understanding of networking concepts, security best practices, and infrastructure-as-code (IaC) principles and tools (e.g., Terraform, Ansible).
- Strong troubleshooting and problem-solving skills with an analytical mindset.
- Excellent communication skills and ability to work collaboratively in a team environment.
- A proactive approach to identifying and resolving issues and improving system reliability.
- Master's degree in a relevant field.
- Specific experience in MLOps or Agent Ops, including deploying and managing machine learning models or large language model applications in production.
- Familiarity with AI/ML frameworks and libraries (e.g., TensorFlow, PyTorch, scikit-learn).
- Understanding of agentic AI concepts and the operational challenges they present.
- Experience with managing vector databases or other specialized data stores for AI.
- Knowledge of data pipeline tools (e.g., Apache Airflow, Kubeflow Pipelines).
- Relevant cloud certifications (e.g., Google Cloud Professional ML Engineer).
- Experience working in an agile development environment.
Why Join Us?
- Play a critical role in operationalizing cutting-edge Agentic AI and AI systems for a global industry leader.
- Gain hands-on experience with the latest MLOps, Agent Ops, and cloud technologies.
- Work in a dynamic, innovative, and collaborative AI Center of Excellence.
- Opportunity to significantly impact the reliability and efficiency of transformative AI solutions.
- Competitive salary, bonus, and benefits package.
Experience Level
Senior Level
Job role
Work location: Chennai IMPT, India
Department: Software Engineering
Role / Category: Software Backend Development
Employment type: Full Time
Shift: Day Shift
Job requirements
Experience: Min. 4 years
About company
Name: Ingram Micro India Private Limited
Job posted by Ingram Micro India Private Limited