Cloud Site Reliability Engineer Architect
Synechron Technologies
Job Description
Cloud SRE Architect
Site Reliability Engineer (SRE) Architect — SRE Team
SRE Team
We are responsible for ensuring our platform remains stable, scalable, and resilient. The SRE team bridges the gap between development and operations — breaking down silos, empowering developers, and fostering a culture of ownership and continuous improvement.
At the architect level, this role shapes the reliability strategy, platform standards, and engineering practices that enable our systems to scale safely and efficiently across the enterprise.
We build creative, automated, and robust solutions to operational challenges, partnering with product, platform, and engineering teams from early design through to production optimization.
We see the big picture — defining standards, enabling consistency, and cultivating an agile, learning-oriented culture. We follow SRE principles such as blameless postmortems, error budgets, and continuous feedback loops to ensure both system reliability and team sustainability.
Above all, we are passionate about automation, observability, and continuous improvement — operating at scale where reliability is a product feature.
As an SRE Architect, you will:
- Define and drive the enterprise reliability strategy, standards, and reference architectures.
- Architect and evolve highly available, scalable platforms across AWS and container ecosystems.
- Lead the design and governance of SLI/SLO frameworks, error budgets, and reliability KPIs across services.
- Provide technical leadership for container platforms (ECS, Fargate, Kubernetes) and cloud-native workloads.
- Establish and mature incident management practices, including major incident response, post-incident reviews, and operational readiness.
- Design and standardize observability architecture (metrics, logs, traces, RUM, synthetic monitoring).
- Partner with security teams to implement least-privilege IAM models, secure data patterns, and cloud guardrails.
- Drive scalability and performance engineering initiatives across critical services.
- Guide teams on resilience patterns (multi-AZ, multi-region, graceful degradation, circuit breakers).
- Influence platform roadmaps and mentor engineers across SRE, platform, and product teams.
- Champion automation-first thinking across infrastructure provisioning, deployments, and operations.
- Act as a technical escalation point for complex production incidents and systemic reliability risks.
In short — design the systems that keep everything running at scale.
Here’s What You Need:
- 15+ years of relevant experience in Site Reliability Engineering, Platform Engineering, or Cloud Infrastructure roles.
- Deep hands-on expertise with AWS, including core services such as:
  - ECS / Fargate
  - EKS / Kubernetes
  - IAM (advanced policy design and guardrails)
  - S3 (security, lifecycle, and large-scale data patterns)
  - EC2, Auto Scaling, ALB/NLB, VPC
- Proven experience designing and operating large-scale container platforms (ECS and/or Kubernetes).
- Strong experience implementing SLI/SLO frameworks, error budgets, and reliability governance.
- Demonstrated leadership in incident management and major incident response.
- Deep understanding of observability ecosystems (Datadog, Dynatrace, Prometheus/Grafana, Splunk, or similar).
- Strong Linux and cloud networking fundamentals.
- Experience with infrastructure as code (Terraform, CloudFormation, or equivalent).
- Proficiency in automation and scripting (Python, Bash, or similar).
- Experience driving scalability, performance tuning, and capacity planning initiatives.
- Strong stakeholder management and cross-functional leadership skills.
- Familiarity with Agile/DevOps delivery models.
Nice to Have:
- Experience building or governing platform engineering / IDP (Internal Developer Platform) capabilities.
- Multi-region or multi-cloud architecture experience.
- Experience with cost optimization (FinOps) at scale.
- Exposure to service mesh or advanced traffic management.
- Familiarity with ITSM platforms such as ServiceNow.
SYNECHRON’S DIVERSITY & INCLUSION STATEMENT
Diversity & Inclusion are fundamental to our culture, and Synechron is proud to be an equal opportunity workplace and is an affirmative action employer. Our Diversity, Equity, and Inclusion (DEI) initiative ‘Same Difference’ is committed to fostering an inclusive culture – promoting equality, diversity and an environment that is respectful to all. We strongly believe that a diverse workforce helps build stronger, successful businesses as a global company. We encourage applicants from across diverse backgrounds, race, ethnicities, religion, age, marital status, gender, sexual orientations, or disabilities to apply. We empower our global workforce by offering flexible workplace arrangements, mentoring, internal mobility, learning and development programs, and more.
All employment decisions at Synechron are based on business needs, job requirements and individual qualifications, without regard to the applicant’s gender, gender identity, sexual orientation, race, ethnicity, disabled or veteran status, or any other characteristic protected by law.
Experience Level
Senior Level