Equinix

Senior Staff Engineer - AIOps and Machine Learning Automation

Bengaluru/Bangalore
Not disclosed
Work from Office
Full Time
Min. 8 years

Job Description

Staff Engineer – Agentic AIOps (MCP, Context Engineering, LLM Automation)

Who are we?

Equinix is the world’s digital infrastructure company®, shortening the path to connectivity to enable the innovations that enrich our work, life and planet.

A place where bold ideas are welcomed, human connection is valued, and everyone has the opportunity to shape their future.

Help us challenge assumptions, uncover bias, and remove barriers—because progress starts with fresh ideas. You’ll find belonging, purpose, and a team that welcomes you—because when you feel valued, you’re empowered to do your best work.

Job Summary

We are looking for a highly skilled Staff Engineer (AIOps) to design, build, and scale the next generation of intelligent operational platforms for our ecosystem. This engineer will work at the intersection of SRE, machine learning, LLMs, observability, and automation, enabling predictive, autonomous operations across a globally distributed environment.

In this role, you will architect and implement AIOps capabilities such as intelligent incident routing, anomaly detection, operational copilots, ChatOps workflows, and automated remediation. You will partner closely with SRE, platform engineering, service management, and product teams to embed intelligence into operational workflows and redefine how digital operations are run.

This is a highly technical, hands-on role requiring strong depth in applied ML/LLMs, operational systems, automation frameworks, and observability data structures.

Responsibilities

AIOps Platform & Intelligence Development

  • Design and build AIOps models (LLMs or classical ML) for anomaly detection, correlation, root-cause identification, and intelligent event clustering.

  • Develop operational copilots and chatbots capable of responding to incidents, surfacing insights, and driving automation through natural language.

  • Build and maintain feature pipelines using telemetry, logs, metrics, traces, and runtime state for operational intelligence use cases.

  • Implement use cases for predictive and preventive operations, such as capacity forecasting, early warning systems, and noisy neighbor detection.

LLM Engineering & Applied AI

  • Build knowledge-grounding systems for operational copilots using runbooks, incident data, historical patterns, service maps, and topology.

  • Integrate LLM-based reasoning into observability and automation platforms.

  • Develop embeddings, retrieval systems (RAG), and intent classification for operational queries.

Automation & Intelligent Remediation

  • Build automated workflows for incident triage, diagnostics, collaboration, and remediation.

  • Architect closed-loop automation patterns connecting alerts → insights → action → verification.

  • Develop reusable automation modules with integration to unified observability, cloud platforms, and orchestration systems.

Data, Observability & Integration

  • Integrate AIOps models with observability platforms (logs, metrics, traces, events, topology).

  • Design real-time inference systems for high-volume telemetry streams.

  • Partner with SRE and platform teams to ensure pipelines, data contracts, and instrumentation support future AIOps workloads.

Operational Excellence & Collaboration

  • Work with transformation teams to define AIOps onboarding patterns, enablement models, and implementation guidelines.

  • Drive AIOps adoption across multiple products/platforms, ensuring reliability, scalability, and continuous improvement.

  • Participate in architecture reviews, data modeling discussions, and SRE transformation initiatives.

Qualifications

  • 8+ years of experience in SRE, platform engineering, ML engineering, data engineering, or AIOps-oriented roles.

  • Strong hands-on experience building ML or LLM-based systems with Python, PyTorch/TensorFlow, or modern LLM frameworks.

  • Experience building automation workflows using tools like StackStorm, Rundeck, Airflow, Jenkins, or cloud-native orchestration.

  • Deep understanding of observability data (logs, metrics, traces) and platforms like Datadog, Splunk, Prometheus, Grafana, ELK.

  • Experience designing and deploying RAG pipelines, embeddings, intent models, or operational chatbots.

  • Strong experience architecting streaming or event-driven systems (Kafka, Kinesis, Pub/Sub).

  • Familiarity with cloud-native systems, Kubernetes, microservices, and modern deployment patterns.

  • Excellent problem-solving skills with the ability to translate operational challenges into ML-based or automation-based solutions.

  • Ability to collaborate across SRE, platform, service management, and engineering teams.

Must Have Skills

  • Hands-on experience with at least one of Claude Code, Codex, or GitHub Copilot

  • Good understanding of context engineering

  • Understanding of agentic harness frameworks

  • Have built at least one MCP server

Equinix is committed to ensuring that our employment process is open to all individuals, including those with a disability. If you are a qualified candidate and need assistance or an accommodation, please let us know by completing this form.

Equinix is an Equal Employment Opportunity and, in the U.S., an Affirmative Action employer. All qualified applicants will receive consideration for employment without regard to unlawful consideration of race, color, religion, creed, national or ethnic origin, ancestry, place of birth, citizenship, sex, pregnancy / childbirth or related medical conditions, sexual orientation, gender identity or expression, marital or domestic partnership status, age, veteran or military status, physical or mental disability, medical condition, genetic information, political / organizational affiliation, status as a victim or family member of a victim of crime or abuse, or any other status protected by applicable law.

We use artificial intelligence in our hiring process. Learn more here.

Experience Level

Senior Level

Job role

Work location: Bangalore Office BLS2, India
Department: Software Engineering
Role / Category: Software Backend Development
Employment type: Full Time
Shift: Day Shift

Job requirements

Experience: Min. 8 years

About company

Name: Equinix
Job posted by Equinix
