Elsevier is looking for a Site Reliability Engineer at the developed professional level. You will take on challenging reliability and toil-reduction projects, drawing on hands-on experience across SRE practices. You will also contribute to process improvements, provide informal guidance to junior staff, and participate in on-call rotations.
What You'll Do
- Create monitoring queries and establish service level baselines.
- Support senior engineers during incidents and contribute to post-mortems and RCAs.
- Participate in disaster recovery tests.
- Implement automation and execute code in production environments.
- Contribute to SRE knowledge documentation.
- Support the creation of infrastructure topology drawings and deployment workflows.
- Test availability, reliability, and recoverability in non-production environments.
- Benchmark and document test performance results for production readiness reviews.
- Participate in on-call rotations to assist in the recovery of Major Incidents.
- Test system and component failover within and between geographic regions.
- Automate the recovery of systems using Infrastructure-as-Code and Configuration Management scripts.
- Create and present RCAs including executive summaries, timelines, and follow-on actions.
- Lead scenario modeling exercises and create workflows triggered by SLO breaches.
- Provide on-call support for other SREs.
- Write advanced automation scripts for incident response, including failovers and rollbacks.
- Contribute to SRE knowledge base articles and training material.
- Analyze toil by examining ticket trends and recommending team focus areas.
- Work independently on small toil elimination projects.
- Apply deep technical understanding of observability across the full stack to clarify complex incidents.
- Create templated observability dashboards and configuration using code.
- Influence the setting of appropriate SLOs and Error Budgets.
- Work within or manage a cross-functional team that migrates applications to standard platforms.
- Provide direction and consultancy to others implementing new Paved Road features or Platforms.
- Analyze and make recommendations to improve the SDLC and CI/CD processes.
- Create actionable reports on the operational health and lifecycle of platform and product components.
What We're Looking For
- Advanced hands-on experience with DevOps practices including monitoring, virtual networks, cloud storage, containers and orchestration, CI/CD, configuration management, and securing cloud applications.
Technical Stack
- Azure (AKS, etc.)
- Terraform
- GitHub
- CI/CD
- Java debugging
- Helm charts
- JFrog
- ELK
- Akeyless or Vault
Benefits & Compensation
- U.S. National Base Pay Range: $95,300 - $158,800. Geographic differentials may apply in some locations to better reflect local market rates.
We are an equal opportunity employer: qualified applicants are considered for and treated during employment without regard to race, color, creed, religion, sex, national origin, citizenship status, disability status, protected veteran status, age, marital status, sexual orientation, gender identity, genetic information, or any other characteristic protected by law.