Software Engineer - Digital Operations
Fractal Analytics Private Limited
Mumbai/Bombay
Not disclosed
Job Details
Job Description
Digi ops Engineer
It's fun to work in a company where people truly BELIEVE in what they are doing!
We're committed to bringing passion and customer focus to the business.
Software Engineer – Digital Operations (5+ Years)
Location: Mumbai (Hybrid)
Employment Type: Full-time
About the Role
We’re looking for a Software Engineer in Digital Operations who blends strong software engineering with operational excellence. You’ll build reliable automation, improve observability, reduce toil, and drive incident & problem management for our customer-facing platforms. You’ll work closely with Product, SRE/Platform, and Engineering teams to keep systems available, scalable, secure, and cost‑efficient.
Key Responsibilities
Reliability & Operations
- Own day-to-day production operations for critical services (availability, performance, capacity).
- Build and maintain observability: metrics, logs, traces, SLOs/error budgets, alerting, and runbooks.
- Lead incident response (L1/L2/L3): triage, communication, mitigation, and post‑incident RCAs with action items.
- Drive problem management: identify recurring issues, eliminate root causes, and reduce MTTR.
Automation & Tooling
- Design and implement automation (Python/Go/Node/PowerShell) to reduce manual toil and improve deployment, rollback, and maintenance workflows.
- Develop self‑service tooling/portals for internal teams (e.g., service restarts, diagnostics, provisioning).
- Build CI/CD integrations (GitHub Actions/Azure DevOps/Jenkins) and infrastructure as code (Terraform/ARM/CloudFormation).
Platform & Cloud
- Operate services on AWS/Azure/GCP across compute, container orchestration (Kubernetes/EKS/AKS/GKE), serverless, and networking, following security best practices.
- Optimize costs (rightsizing, autoscaling, spot/reserved instances, storage lifecycle policies).
Data & Integrations
- Create operational dashboards (Grafana/Datadog/CloudWatch/Prometheus/New Relic).
- Write SQL for diagnostics and build data pipelines for operational insights.
- Integrate with ITSM (ServiceNow/Jira) for incident/change/problem workflows; follow ITIL practices where applicable.
Governance & Continuous Improvement
- Maintain runbooks, SOPs, and knowledge base articles.
- Define and track KPIs (availability, MTTR, change failure rate, alert noise).
- Champion security & compliance in operations (secrets management, patching, vulnerability remediation).
- Mentor junior engineers and participate in on‑call rotations.
Required Qualifications
- 5–7 years in software engineering, DevOps, SRE, or production operations.
- Strong coding skills in Python (preferred) or Go/Node.js/Java, with a focus on automation.
- Solid knowledge of Linux, networking (HTTP, TLS, DNS), and containerization (Docker, Kubernetes basics).
- Experience with observability stacks (Prometheus/Grafana/ELK/Datadog/New Relic) and alerting design.
- Hands-on experience with CI/CD (GitHub Actions, Jenkins, Azure DevOps) and IaC (Terraform).
- Practical experience with at least one major cloud (AWS/Azure/GCP).
- Proven track record in incident management and postmortems with measurable improvements.
- Proficiency with SQL for troubleshooting; familiarity with message queues (Kafka/RabbitMQ) is a plus.
Nice to Have
- SRE practices: SLOs, error budgets, chaos testing.
- Security: IAM, Vault/KeyVault/KMS, secrets scanning, compliance (SOC2/ISO 27001).
- Databases: Postgres/MySQL, NoSQL (MongoDB/Redis), performance tuning basics.
- Platform: Helm, ArgoCD, service meshes (Istio/Linkerd).
- ITIL v3/v4 exposure, ServiceNow automations.
- GenAI in Ops: ChatOps/LLM-assisted troubleshooting, auto-remediation playbooks.
If you like wild growth and working with happy, enthusiastic over-achievers, you'll enjoy your career with us!
Not the right fit? Let us know you're interested in a future opportunity by clicking Introduce Yourself in the top-right corner of the page, or create an account to set up email alerts for new job postings that match your interests.
Experience Level: Senior Level
Job role
Work location: Mumbai, India
Department: Software Engineering
Role / Category: Software Development
Employment type: Full Time
Shift: Day Shift
Job requirements
Experience: Min. 5 years
About company
Name: Fractal Analytics Private Limited
Job posted by Fractal Analytics Private Limited