System Reliability Engineer

Fulcrum Digital

Pune

Not disclosed

Work from Office

Full Time

Min. 5 years

Job Details

Job Description

Sr. System Reliability Engineer (Application Support)

Who we are
Fulcrum Digital is an agile, next-generation digital acceleration company that provides digital transformation and technology services, from ideation to implementation. These services apply across a variety of industries, including banking & financial services, insurance, retail, higher education, food, healthcare, and manufacturing.

The Role

·        Plan, manage, and oversee all aspects of the production environment.

·        Define strategies for application performance monitoring and optimization in the production environment.

·        Respond to incidents, improve the platform based on feedback, and measure the reduction of incidents over time.

·        Support deployment of code into multiple lower environments, and support current processes with an emphasis on automating everything as soon as possible.

·        Design, develop, and standardize monitoring and alerting mechanisms for the supported applications.

·        Take a holistic approach to problem solving by connecting the dots during a production event across the various technology stacks that make up the platform, to optimize mean time to recover.

·        Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement.

·        Analyze ITSM activities of the platform and provide a feedback loop to development teams on operational gaps or resiliency concerns.

·        Support services before they go live through activities such as system design consulting, capacity planning and launch reviews.

·        Support the application CI/CD pipeline for promoting software into higher environments through validation and operational gating, and take the lead in DevOps automation and best practices.

·        Maintain services once they are live by measuring and monitoring availability, latency and overall system health.

·        Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity.

·        Work with a global team spread across tech hubs in multiple geographies and time zones.

·        Share knowledge and explain processes and procedures to others.

·        Mentor junior team members.

·        Perform on-call duties on a rotational basis.

·        Work occasional off-hours as required.

·        Have an inclination for training, be a good trainer, and be ready to mentor others.



Requirements

Skills –

Must Have:

·        Linux

·        Shell Scripting

·        ITIL / ITSM, Application Troubleshooting

·        SQL

·        Any monitoring tool (Splunk or Dynatrace preferred)

·        Jenkins CI/CD (basic)

Good To Have:

·        Groovy scripting / YAML

·        Git (basic) / Bitbucket

·        Ansible/Chef

·        Event framework architecture



Experience Level

Senior Level

Job role

Work location

Pune, India

Department

IT & Information Security

Role / Category

IT Support

Employment type

Full Time

Shift

Day Shift

Job requirements

Experience

Min. 5 years

About company

Name

Fulcrum Digital

Job posted by Fulcrum Digital
