Site Reliability Engineer

Accenture India Private Limited

Bengaluru/Bangalore

Not disclosed

Work from Office

Full Time

Min. 5 years

Job Details

Job Description

Project Role: Custom Software Engineer
Project Role Description: Develop custom software solutions to design, code, and enhance components across systems or applications. Use modern frameworks and agile practices to deliver scalable, high-performing solutions tailored to specific business needs.
Must-have skills: AWS Administration
Good-to-have skills: NA
Minimum 3 year(s) of experience is required
Educational Qualification: 15 years of full-time education

Job Description – Site Reliability Engineer (SRE) – Experienced
Role Overview

We are looking for an experienced Site Reliability Engineer (SRE) who can drive reliability, automation, and operational excellence across production environments. The ideal candidate should possess strong technical expertise, excellent communication skills, and the confidence to lead incident response, collaborate with stakeholders, and influence engineering decisions.


Key Responsibilities

Reliability & Availability

Ensure high availability, performance, and resilience of critical services and platforms.
Design, implement, and improve monitoring, alerting, and observability systems.
Identify reliability risks and proactively mitigate them.

Incident Management

Lead and manage major incidents (P1/P2) with a calm and structured approach.
Drive root cause analysis (RCA) and implement long-term corrective actions.
Act as a bridge between engineering, product, and business teams during incidents.

Automation & Tooling

Reduce operational toil through automation and self-service tools.
Build and maintain scripts, workflows, and automation frameworks.
Improve deployment, rollback, and recovery processes.

Collaboration & Communication

Work closely with development, cloud, and operations teams.
Communicate clearly and confidently with leadership, stakeholders, and clients.
Present technical insights in a business-friendly manner.

Performance & Capacity Management

Analyze system performance trends and plan for capacity needs.
Optimize infrastructure costs while maintaining reliability.

Cloud & Infrastructure

Manage and troubleshoot cloud environments (AWS/Azure/GCP preferred).
Work with container platforms (Kubernetes/Docker) and CI/CD pipelines.

Required Skills & Experience

5+ years of experience in SRE, DevOps, or Production Support roles.
Strong experience with Linux, networking, and distributed systems.
Hands-on experience with monitoring tools (Splunk, Grafana, Prometheus, AppDynamics, etc.).
Proficiency in at least one scripting language (Python, Bash, Go, or similar).
Experience with cloud platforms (AWS preferred).
Good understanding of CI/CD, containers, and infrastructure as code (Terraform/Ansible).
Excellent verbal and written communication skills.
Demonstrated ability to lead incidents and work under pressure.
Confident decision-making and stakeholder management skills.

Preferred Qualifications

Experience in enterprise-scale production environments.
Familiarity with ITIL processes and incident/problem/change management.
Experience building SRE dashboards, runbooks, and playbooks.
Knowledge of FinOps, observability best practices, and reliability engineering principles.

Personal Attributes

Strong problem-solving mindset.
Confident communicator with leadership presence.
Team player with a collaborative attitude.
Ability to mentor junior engineers.
Comfortable working in 24/7 operational environments.

Job role

Work location: Bengaluru
Department: IT & Information Security
Role / Category: IT Security
Employment type: Full Time
Shift: Day Shift

Job requirements

Experience: Min. 5 years

About company

Name: Accenture India Private Limited

Job posted by Accenture India Private Limited