Software Engineer - Digital Operations

Fractal Analytics Private Limited

Mumbai/Bombay

Not disclosed

Work from Office

Full Time

Min. 5 years

Job Details

Job Description

Digital Operations Engineer

It's fun to work in a company where people truly BELIEVE in what they are doing!

We're committed to bringing passion and customer focus to the business.

Software Engineer – Digital Operations (5+ Years)

Location

Mumbai (Hybrid)

Employment Type

Full-time

About the Role

We’re looking for a Software Engineer in Digital Operations who blends strong software engineering with operational excellence. You’ll build reliable automation, improve observability, reduce toil, and drive incident & problem management for our customer-facing platforms. You’ll work closely with Product, SRE/Platform, and Engineering teams to keep systems available, scalable, secure, and cost‑efficient.

Key Responsibilities

Reliability & Operations

  • Own day-to-day production operations for critical services (availability, performance, capacity).
  • Build and maintain observability: metrics, logs, traces, SLOs/error budgets, alerting, and runbooks.
  • Lead incident response (L1/L2/L3): triage, communication, mitigation, and post‑incident RCAs with action items.
  • Drive problem management: identify recurring issues, eliminate root causes, and reduce MTTR.

Automation & Tooling

  • Design and implement automation (Python/Go/Node/PowerShell) to reduce manual toil and improve deployment, rollback, and maintenance workflows.
  • Develop self‑service tooling/portals for internal teams (e.g., service restarts, diagnostics, provisioning).
  • Build CI/CD integrations (GitHub Actions/Azure DevOps/Jenkins) and infrastructure as code (Terraform/ARM/CloudFormation).

Platform & Cloud

  • Operate services on AWS/Azure/GCP (compute, serverless, and container orchestration with Kubernetes/EKS/AKS/GKE), following networking and security best practices.
  • Optimize costs (rightsizing, autoscaling, spot/reserved instances, storage lifecycle policies).

Data & Integrations

  • Create operational dashboards (Grafana/Datadog/CloudWatch/Prometheus/New Relic).
  • Write SQL for diagnostics and build data pipelines for operational insights.
  • Integrate with ITSM (ServiceNow/Jira) for incident/change/problem workflows; follow ITIL practices where applicable.

Governance & Continuous Improvement

  • Maintain runbooks, SOPs, and knowledge base articles.
  • Define and track KPIs (availability, MTTR, change failure rate, alert noise).
  • Champion security & compliance in operations (secrets management, patching, vulnerability remediation).
  • Mentor junior engineers and participate in on‑call rotations.

Required Qualifications

  • 5–7 years in software engineering, DevOps, SRE, or production operations.
  • Strong coding in Python (preferred) or Go/Node.js/Java, with a focus on automation.
  • Solid knowledge of Linux, networking (HTTP, TLS, DNS), and containerization (Docker, Kubernetes basics).
  • Experience with observability stacks (Prometheus/Grafana/ELK/Datadog/New Relic) and alerting design.
  • Hands-on with CI/CD (GitHub Actions, Jenkins, Azure DevOps) and IaC (Terraform).
  • Practical experience on at least one major cloud (AWS/Azure/GCP).
  • Proven track record in incident management and postmortems with measurable improvements.
  • Proficiency with SQL for troubleshooting; familiarity with message queues (Kafka/RabbitMQ) is a plus.

Nice to Have

  • SRE practices: SLOs, error budgets, chaos testing.
  • Security: IAM, Vault/KeyVault/KMS, secrets scanning, compliance (SOC2/ISO 27001).
  • Databases: Postgres/MySQL, NoSQL (MongoDB/Redis), performance tuning basics.
  • Platform: Helm, ArgoCD, service meshes (Istio/Linkerd).
  • ITIL v3/v4 exposure, ServiceNow automations.
  • GenAI in Ops: ChatOps/LLM-assisted troubleshooting, auto-remediation playbooks.

If you like wild growth and working with happy, enthusiastic over-achievers, you'll enjoy your career with us!

Experience Level

Senior Level

Job role

Work location

Mumbai, India

Department

Software Engineering

Role / Category

Software Development

Employment type

Full Time

Shift

Day Shift

Job requirements

Experience

Min. 5 years

About company

Name

Fractal Analytics Private Limited

Job posted by Fractal Analytics Private Limited
