AI Operations Engineer
Fulcrum Digital
Pune
Not disclosed
Job Details
Job Description — AI Ops Engineer (DevOps/MLOps/AIOps)
Who are we
Fulcrum Digital is an agile, next-generation digital acceleration company providing digital transformation and technology services from ideation to implementation. These services apply across a variety of industries, including banking & financial services, insurance, retail, higher education, food, healthcare, and manufacturing.
Role summary
We are seeking an AI Ops Engineer to run and improve the operational reliability of our AI/ML and GenAI platforms. You will own CI/CD enablement, Infrastructure-as-Code creation and maintenance, monitoring/alerting, incident triage, release readiness, operational automation, and cloud cost governance. This role is hands-on and technical, and operates within enterprise security controls (segregation of duties). You will drive outcomes end-to-end, partnering with C&F Service Desk/Cloud/Network teams when elevated access is required.
Key responsibilities
- CI/CD & deployments: build/maintain pipelines and deployment processes aligned to Cyber/compliance expectations.
- IaC & implementation packages: create/maintain Infrastructure-as-Code (e.g., Terraform/Bicep) and provide complete technical specs for approved execution.
- Operate & support production: monitor dashboards/logs, triage incidents, perform RCAs, maintain runbooks, and drive issue closure.
- Alert hygiene: reduce alert noise, tune rules/thresholds, and ensure alerts are actionable (severity, ownership, playbooks).
- Enterprise integrations & access issues: troubleshoot and drive fixes for items such as:
  - Power Platform to Jira connector issues (incl. allowlisting/service tags)
  - HTTPS connector compliance needs
  - Citrix VDI / Netskope / proxy access blocks (e.g., VS Code, external tools)
- Cloud cost management (automation-first): monitor spend, implement tagging/controls, automate recurring cost reporting (avoid manual spreadsheet-heavy processes).
Working model (important)
- Due to segregation of duties and centralized governance, you may not have persistent elevated/admin access.
- Some changes (RBAC, network/security rules, proxy allowlists, certain provisioning) must be executed by C&F Service Desk/Cloud/Network/Security teams.
- Ownership in this role means: you diagnose, propose the fix, provide IaC/specs + validation steps, coordinate execution, and verify results.
Requirements
Required qualifications
- 3+ years in DevOps/SRE/Cloud Ops/MLOps/AIOps (or equivalent).
- Experience with at least one major cloud (Azure and/or AWS) in enterprise environments with restricted permissions.
- Hands-on experience with CI/CD, IaC, and observability/monitoring tools.
- Strong troubleshooting skills using logs/metrics/traces; scripting/automation with Python/PowerShell/Bash.
- Strong communication and responsiveness (timely acknowledgements, proactive follow-through).
Preferred
- Experience supporting AI/ML platforms or API-based services.
- Containers/Kubernetes and/or serverless exposure.
- FinOps experience (cost allocation, anomaly detection, optimization).
Experience Level
Mid Level
Job role
Work location
Pune City, India
Department
IT & Information Security
Role / Category
IT Security
Employment type
Full Time
Shift
Day Shift
Job requirements
Experience
Min. 3 years
About company
Name
Fulcrum Digital
Job posted by Fulcrum Digital