Akamai Technologies seeks a Manager of Site Reliability Engineering to lead and grow a team responsible for the availability, reliability, scalability, and performance of our critical systems. In this role, you will balance hands-on technical leadership with people management, operational excellence, and cross-functional collaboration.
What You'll Do
- Lead, mentor, and build a team of Site Reliability Engineers, fostering a culture of reliability, ownership, and continuous improvement.
- Define and drive the team's vision, strategy, and roadmap, aligned with business and engineering goals.
- Establish and manage key performance and reliability metrics.
- Oversee monitoring, alerting, and incident management practices to ensure rapid detection and resolution of issues.
- Lead major incident response efforts and post-incident reviews, and drive corrective and preventive actions.
- Partner with engineering, product, and infrastructure teams to influence system design and reliability best practices.
- Identify operational risks and scalability challenges, and proactively drive long-term solutions.
- Promote automation and build operational efficiency to reduce manual interventions and improve system stability.
What We're Looking For
- At least 10 years of experience in Site Reliability Engineering or DevOps, including prior experience leading or managing teams.
- Strong understanding of Linux/Unix systems and networking fundamentals.
- Strong understanding of distributed systems, cloud infrastructure, and reliability engineering principles.
Team & Environment
You will be part of the Global Services Engineering Team.





