Application Monitoring Consultant
Yash Technologies Private Limited
Job Description:
We are seeking a Full-stack Infrastructure Observability Specialist to join the Infra and Operations Team. This role will focus on building and enabling end-to-end observability strategies across applications, infrastructure, and networks. A key responsibility is to design, implement, and optimize monitoring frameworks that leverage AIOps, automation, and cloud-native observability tools to deliver proactive insights, predictive analytics, and zero-downtime operations.
You will administer and integrate observability platforms, develop intelligent alerting and dashboarding, and collaborate with cross-functional teams to ensure resilient, scalable, and secure infrastructure.
Key Responsibilities
- Observability Strategy: Define and execute a full-stack observability roadmap aligned with business and IT goals, embedding AIOps and SRE principles.
- Monitoring Frameworks: Design and implement comprehensive monitoring solutions for applications, infrastructure, and networks to ensure continuous performance and availability.
- Data Analysis & Insights: Use AIOps-driven analytics to identify trends, predict failures, and automate corrective actions.
- Tool Ownership & Integration: Manage and optimize observability tools (Splunk, Datadog, Prometheus, Grafana, ThousandEyes, ServiceNow AIOps, etc.), integrating them across hybrid environments.
- Automation & Intelligence: Develop automated workflows for alerting, incident detection, and root cause analysis using scripting and AI-driven approaches.
- Dashboarding & Reporting: Build intelligent dashboards and provide actionable insights to stakeholders on system health, incidents, and performance improvements.
- Incident & Problem Management: Partner with ITSM teams to enhance detection, triage, and resolution workflows with AI-assisted root cause analysis.
- Continuous Improvement: Stay current with emerging observability and AIOps technologies, and integrate them to enhance monitoring capabilities.
Qualifications
- 5+ years of experience in IT infrastructure, monitoring, or observability roles.
- Strong experience in AIOps platforms and applying AI/ML for monitoring, anomaly detection, and predictive analytics.
- Expertise with observability tools: Datadog, OpManager, Splunk, Dynatrace, AppDynamics, New Relic, Prometheus, Grafana, Nagios, etc.
- Familiarity with cloud-native monitoring across AWS, Azure, GCP, and on-premises data centers.
- Proficiency in scripting/automation (Python, Shell, PowerShell, Ansible).
- Experience with DevOps and cloud-native environments (Kubernetes, Docker, Terraform, CI/CD pipelines).
- Knowledge of database monitoring (SQL and NoSQL).
- Strong analytical and problem-solving skills for proactive detection and resolution.
- Excellent communication and collaboration skills to work across IT Ops, DevOps, Security, and Application teams.
- Experience presenting monitoring insights and observability metrics to executives and stakeholders.
- Solid foundation in networking and Linux administration.
- Experience with Atlassian tooling (Jira, Confluence) preferred.
- Certifications (ITIL, DevOps, AWS, Azure, GCP, Agile, PMP) are a plus.
Experience Level
Mid Level