Site Reliability Engineer
Oracle India Private Limited
Job Description
Site Reliability Developer 3
At Oracle Cloud Infrastructure (OCI), we are building the future of cloud for enterprises with the agility of a startup and the scale of a global enterprise leader. Compute is one of OCI’s foundational organizations, responsible for delivering the core infrastructure powering Virtual Machines (VMs) and Bare Metal (BM) services.
As an OCI Site Reliability Engineer (SRE), you will work closely with development and product teams in a shared full-stack ownership model across multiple services and technology domains. You will develop deep expertise in service architecture, dependencies, configurations, and operational behavior across large-scale production environments.
You will be responsible for improving the reliability, scalability, performance, and operational efficiency of OCI Compute services. The role includes handling critical customer incidents, supporting deployments, performing validation and operational testing, troubleshooting complex infrastructure issues, conducting root cause analysis (RCA), and driving service reliability improvements.
You will act as a key escalation point for complex production issues, leveraging strong knowledge of distributed systems, service topology, and infrastructure dependencies to identify mitigations and restore service health while partnering with development teams to meet SLA commitments.
The role also involves leveraging AIOps and intelligent automation to enhance monitoring, anomaly detection, event correlation, predictive alerting, RCA, and remediation workflows. Using observability platforms, telemetry analytics, and automation frameworks, you will help reduce operational toil, improve incident response, and enhance overall service reliability.
This is an opportunity to combine deep technical expertise with operational excellence to solve complex cloud infrastructure challenges at massive scale within Oracle’s next-generation cloud platform.
Install, monitor, maintain, support, and optimize all production server hardware and software. Provide escalated technical support for complex technical issues, which may include leading problem management cases and reporting status to management. Coordinate escalated support cases, leading the appropriate internal technical resources and/or third-party vendors to resolution, and coordinate the storage infrastructure of Oracle systems and database appliances.
Take responsibility for Oracle production environments: assist with server operating system and application upgrades, bug fixes, patching, and deployment activities. Work on standardization projects for both hardware and software under the Oracle technology stack while providing the consistent system uptime expected in a cloud environment. Leverage AIOps, observability platforms, telemetry analytics, intelligent automation, event correlation, predictive alerting, and automated remediation workflows to improve operational efficiency, incident response, and service reliability, and to reduce operational toil.
Provide on-call support on a rotating basis. Responsibilities include, but are not limited to:
- Strong programming/scripting skills in Python, Java, or Go are preferred for automation, tooling, debugging, and operational engineering initiatives
- Strong hands-on experience with Enterprise Linux operating systems in large-scale production environments
- Incident management
- Support and troubleshooting of Staging/Production environments
- Participate in 24x7 On-Call rotations for multiple services and demonstrate operational flexibility in shift-based environments
- Drive and maintain high availability, scalability, reliability, and operational excellence of cloud services
- Perform Root Cause Analysis (RCA) and implement corrective and preventive actions for recurring issues
- Test, validate, and deploy solutions while automating manual operational processes
- Build and maintain deployment tools, CI/CD pipelines, operational procedures, and automation frameworks
- Drive zero-downtime deployments with a strong high-availability and reliability-first mindset
- Support infrastructure upgrades, patching, change management, and production rollout activities
- Leverage AIOps, observability platforms, telemetry analytics, intelligent automation, event correlation, anomaly detection, predictive alerting, and automated remediation workflows to improve operational efficiency and incident response
- Define and build scalable operational solutions around infrastructure, cloud migration, and distributed systems operations
- Work closely with development and service teams to troubleshoot complex issues requiring code-level analysis and deep understanding of service dependencies
- Ensure strong production security posture, operational compliance, and infrastructure reliability
- Support capacity planning, performance optimization, system tuning, and scalability initiatives across OCI environments
Career Level - IC3
Only Oracle brings together the data, infrastructure, applications, and expertise to power everything from industry innovations to life-saving care. And with AI embedded across our products and services, we help customers turn that promise into a better future for all. Discover your potential at a company leading the way in AI and cloud solutions that impact billions of lives.
True innovation starts when everyone is empowered to contribute. That’s why we’re committed to growing a workforce that promotes opportunities for all with competitive benefits that support our people with flexible medical, life insurance, and retirement options. We also encourage employees to give back to their communities through our volunteer programs.
We’re committed to including people with disabilities at all stages of the employment process. If you require accessibility assistance or accommodation for a disability at any point, let us know by emailing accommodation-request_mb@oracle.com or by calling 1-888-404-2494 in the United States.
Oracle is an Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability and protected veterans’ status, or any other characteristic protected by law. Oracle will consider for employment qualified applicants with arrest and conviction records pursuant to applicable law.
Experience Level - Senior Level
The salary offered will depend on your skills, experience, and performance in the interview.
Candidates should have completed the required education; those with 5 to 31 years of experience are eligible to apply for this job. You can apply for more jobs in Bengaluru/Bangalore to get hired quickly.
The candidate should have sound communication skills for this job.
Both male and female candidates can apply for this job.
This is not a work-from-home job and cannot be done online.
No work-related deposit needs to be made during your employment with the company.
Go to the apna app and apply for this job. Click on the apply button and call HR directly to schedule your interview.
This is a full-time job in Bengaluru/Bangalore. For more details, download the apna app.