AI Operations Engineer

Fulcrum Digital

Pune

Not disclosed

Work from Office

Full Time

Min. 3 years

Job Details

Job Description

Job Description — AI Ops Engineer (DevOps/MLOps/AIOps)

Who we are

Fulcrum Digital is an agile, next-generation digital acceleration company that provides digital transformation and technology services from ideation to implementation. These services apply across a variety of industries, including banking & financial services, insurance, retail, higher education, food, healthcare, and manufacturing.

Role summary

We are seeking an AI Ops Engineer to run and improve the operational reliability of our AI/ML and GenAI platforms. You will own CI/CD enablement, Infrastructure-as-Code creation and maintenance, monitoring/alerting, incident triage, release readiness, operational automation, and cloud cost governance. This role is hands-on and technical, and operates within enterprise security controls (segregation of duties): you will drive outcomes end-to-end, partnering with C&F Service Desk/Cloud/Network teams when elevated access is required.

Key responsibilities

  • CI/CD & deployments: build/maintain pipelines and deployment processes aligned to Cyber/compliance expectations.
  • IaC & implementation packages: create/maintain Infrastructure-as-Code (e.g., Terraform/Bicep) and provide complete technical specs for approved execution.
  • Operate & support production: monitor dashboards/logs, triage incidents, perform RCAs, maintain runbooks, and drive issue closure.
  • Alert hygiene: reduce alert noise, tune rules/thresholds, and ensure alerts are actionable (severity, ownership, playbooks).
  • Enterprise integrations & access issues: troubleshoot and drive fixes for items such as:
    • Power Platform ↔ Jira connector issues (incl. allowlisting/service tags)
    • HTTPS connector compliance needs
    • Citrix VDI / Netskope / proxy access blocks (e.g., VS Code, external tools)
  • Cloud cost management (automation-first): monitor spend, implement tagging/controls, and automate recurring cost reporting (avoid manual spreadsheet-heavy processes).

Working model (important)

  • Due to segregation of duties and centralized governance, you may not have persistent elevated/admin access.
  • Some changes (RBAC, network/security rules, proxy allowlists, certain provisioning) must be executed by C&F Service Desk/Cloud/Network/Security teams.
  • Ownership in this role means: you diagnose, propose the fix, provide IaC/specs + validation steps, coordinate execution, and verify results.

Requirements

Required qualifications

  • 3+ years in DevOps/SRE/Cloud Ops/MLOps/AIOps (or equivalent).
  • Experience with at least one major cloud (Azure and/or AWS) in enterprise environments with restricted permissions.
  • Hands-on experience with CI/CD, IaC, and observability/monitoring tools.
  • Strong troubleshooting skills using logs/metrics/traces; scripting/automation with Python/PowerShell/Bash.
  • Strong communication and responsiveness (timely acknowledgements, proactive follow-through).

Preferred

  • Experience supporting AI/ML platforms or API-based services.
  • Containers/Kubernetes and/or serverless exposure.
  • FinOps experience (cost allocation, anomaly detection, optimization).

Experience Level: Mid Level

Job role

Work location: Pune City, India
Department: IT & Information Security
Role / Category: IT Security
Employment type: Full Time
Shift: Day Shift

Job requirements

Experience: Min. 3 years

About company

Name: Fulcrum Digital

Job posted by Fulcrum Digital
