Accenture is seeking a Service Management Lead to oversee the delivery of programs, projects, and managed services. In this role, you will act as a Site Reliability Engineer (SRE), ensuring systems are stable, scalable, and highly available while coordinating projects and cultivating key stakeholder relationships.
What You'll Do
- Lead the delivery of programs, projects, or managed services.
- Coordinate projects through contract management and shared service coordination.
- Develop and maintain relationships with key stakeholders and sponsors.
- Monitor and optimize system uptime, latency, and throughput, tracked as SLIs, to meet SLOs.
- Lead incident response, manage escalations, perform root cause analysis (RCA), and drive postmortem reviews.
- Develop CI/CD pipelines, automate infrastructure management, and eliminate manual toil through scripting and orchestration.
- Implement metrics, logging, and tracing frameworks to gain real-time visibility into distributed systems.
- Conduct resource forecasting, design scalable infrastructure, and handle performance under surge conditions.
- Partner with developers to ensure safe, reliable rollout of new features with automated testing and rollback mechanisms.
- Implement multi-region resilience strategies, chaos tests, and failover automation for business continuity.
- Use post-incident analytics to refine operational practices and improve reliability.
- Collaborate with product, design, ML, and DevOps teams to build intelligent workflows and user experiences.
- Implement Infrastructure as Code (IaC) using tools like Terraform, CloudFormation, Azure DevOps, or Pulumi.
What We're Looking For
- Expertise in Site Reliability Engineering.
- Minimum 3 years of overall professional experience.
- Minimum 5 years of experience specifically in Site Reliability Engineering.
- Expertise in Python, Go, Bash, or JavaScript for automation and tooling.
- Hands-on experience with cloud environments (AWS, Azure, GCP) and orchestration tools like Kubernetes and Terraform.
- Deep understanding of Linux systems, networking, and distributed architectures.
- Experience with observability solutions such as Prometheus, Grafana, Datadog, CloudWatch, or New Relic.
- Familiarity with incident management and alerting platforms (PagerDuty, xMatters).
- Proficiency in CI/CD frameworks such as Jenkins, GitHub Actions, or GitLab CI.
- Working knowledge of security, compliance, and performance optimization for highly available systems.
- Expertise in cloud IaaS and PaaS services.
Nice to Have
- Expertise in Python.
- Deep experience with Kubernetes.
- Knowledge of AWS AI Services.
- Certifications: AWS Certified Solutions Architect Professional, Microsoft Certified: Azure Solutions Architect Expert, Google Professional Cloud Architect, Certified Kubernetes Administrator (CKA), HashiCorp Certified: Terraform Associate, or a DevOps Engineer certification from AWS, Azure, or Google.
Technical Stack
- Languages/Scripting: Python
- Orchestration: Kubernetes, Terraform, CloudFormation, Pulumi
- Cloud Platforms: AWS (including AI Services), Azure, GCP
- Observability: Prometheus, Grafana, ELK, Datadog, CloudWatch, New Relic
- Incident Management: PagerDuty, xMatters
- CI/CD: Jenkins, GitHub Actions, GitLab CI
- IaC Tools: Terraform, CloudFormation, Azure DevOps, Pulumi
Work Mode
This is an onsite position based in Bengaluru.
We believe that no one should be discriminated against because of their differences. All employment decisions shall be made without regard to age, race, creed, color, religion, sex, national origin, ancestry, disability status, military veteran status, sexual orientation, gender identity or expression, genetic information, marital status, citizenship status or any other basis as protected by applicable law.




