Senior Site Reliability Engineer
Equifax Credit Information Services Pvt Ltd
Thiruvananthapuram
Not disclosed
Job Details
Job Description
Senior Site Reliability-DevOps Engineer
Site Reliability Engineering (SRE) at Equifax is a discipline that combines software and systems engineering for building and running large-scale, distributed, fault-tolerant systems. SRE ensures that internal and external services meet or exceed reliability and performance expectations while adhering to Equifax engineering principles.
SRE is also an engineering approach to building and running production systems: we engineer solutions to operational problems. Our SREs are responsible for overall system operation, and we use a breadth of tools and approaches to solve a broad set of problems. Our practices include limiting time spent on operational work, blameless postmortems, and proactive identification and prevention of potential outages.
What you’ll do
- Architecture and Design: Participate in the design and architecture of highly scalable, resilient, and secure systems on Kubernetes. Contribute to the definition of SRE principles and best practices.
- Automation: Develop and maintain automation frameworks for infrastructure provisioning, deployment, monitoring, and incident response using tools like Terraform, Ansible, Puppet, Chef, or similar.
- Monitoring and Alerting: Design and implement comprehensive monitoring and alerting systems to proactively identify and resolve issues. Develop and maintain dashboards to track key performance indicators (KPIs).
- Incident Management: Lead incident response efforts and conduct thorough post-incident reviews to identify root causes and implement preventative measures.
- Capacity Planning: Proactively identify and address capacity constraints to ensure optimal system performance and availability.
- Collaboration: Work closely with engineering, product, and security teams to align on system requirements and priorities.
- Mentorship: Mentor and guide junior SRE/DevOps engineers, fostering a culture of continuous learning and improvement.
- On-call Rotation: Participate in a rotating on-call schedule to provide 24/7 support for critical systems.
- Security: Contribute to the security posture of our systems by implementing security best practices and participating in security audits and reviews.
- Performance Optimization: Identify and resolve performance bottlenecks, optimizing system performance and resource utilization.
What experience you need
- 7+ years of experience as an SRE, DevOps Engineer, or in a similar role.
- Deep understanding of cloud platforms such as GCP (AWS and Azure are a plus).
- Extensive experience with containerization technologies like Docker and Kubernetes.
- Proven experience with infrastructure-as-code and configuration management tools (e.g., Terraform, Ansible, Puppet, Chef).
- Strong scripting skills (e.g., Python, Go, Bash).
- Experience with monitoring and logging tools (e.g., Datadog, Prometheus, Grafana, ELK stack).
- Experience with CI/CD pipelines and tools (e.g., Jenkins, GitLab CI, CircleCI).
- Experience with incident management and post-incident reviews.
- Excellent problem-solving and troubleshooting skills.
- Strong communication and collaboration skills.
- Bachelor's degree in Computer Science or a related field; equivalent experience considered.
What could set you apart
- Google Cloud certifications.
- You have experience designing, analyzing and troubleshooting large-scale distributed systems.
- You take a system problem-solving approach, coupled with strong communication skills and a sense of ownership and drive.
- You have experience managing infrastructure as code via tools such as Terraform.
- You are passionate about automation, with a desire to eliminate toil whenever possible.
- You’ve built software or maintained systems in a highly secure, regulated, or compliant industry.
- You have experience with and a passion for working within a DevOps culture and as part of a team.
Job role
Work location
Trivandrum
Department
IT & Information Security
Role / Category
IT Infrastructure Services
Employment type
Full Time
Shift
Day Shift
Job requirements
Experience
Min. 7 years
About company
Name
Equifax Credit Information Services Pvt Ltd
Job posted by Equifax Credit Information Services Pvt Ltd