Chaos Engineer
Fulcrum Digital
Pune
Not disclosed
Job Details
Job Description
Sr Chaos Engineer
Who are we
Fulcrum Digital is an agile and next-generation digital accelerating company providing digital transformation and technology services right from ideation to implementation. These services have applicability across a variety of industries, including banking & financial services, insurance, retail, higher education, food, healthcare, and manufacturing.
Key Responsibilities:
Chaos Engineering:
· Design and implement chaos engineering experiments to identify weaknesses in systems and applications.
· Develop and execute strategies to improve system resilience and reliability.
· Analyze experiment results, provide actionable insights, and drive remediation efforts.
· Collaborate with development, operations, and infrastructure teams to integrate chaos engineering practices.
Operational Acceptance:
· Develop and maintain comprehensive operational acceptance criteria for new and existing systems.
· Conduct thorough operational acceptance testing, ensuring systems meet all predefined criteria before go-live.
· Work closely with project managers, developers, and QA teams to align operational acceptance processes with project timelines and objectives.
· Document and communicate operational readiness findings, providing recommendations for improvement.
System Resilience and Reliability:
· Implement and manage strategies for continuous improvement of system resilience and reliability.
· Monitor and assess system performance, identifying potential risks and areas for enhancement.
· Lead initiatives to improve disaster recovery and business continuity plans.
· Stay updated with the latest industry trends and best practices in chaos engineering and operational acceptance.
Collaboration and Training:
· Educate and mentor team members on chaos engineering and operational acceptance methodologies.
· Foster a culture of resilience and reliability within the organization.
· Engage with external communities, attending conferences and participating in knowledge-sharing events.
Requirements
· Extensive experience in chaos engineering, operational acceptance testing, and system resilience.
· Strong understanding of cloud platforms (AWS, Azure, GCP) and their resilience features.
· Proficiency in scripting and automation tools (Python, Bash, Terraform, etc.).
· Experience with monitoring and observability tools (Prometheus, Grafana, Splunk, etc.).
· Experience with chaos engineering tools such as Gremlin and Chaos Monkey.
· Excellent analytical and problem-solving skills.
· Strong communication and collaboration skills, with the ability to work effectively in cross-functional teams.
· Certifications in relevant fields (e.g., AWS Certified Solutions Architect, Azure DevOps Engineer) are a plus.
Experience Level
Senior Level
Job role
Work location
Pune, India
Department
IT & Information Security
Role / Category
IT Security
Employment type
Full Time
Shift
Day Shift
Job requirements
Experience
Min. 5 years
About company
Name
Fulcrum Digital
Job posted by Fulcrum Digital