Remote | £60,000 - £70,000

Arbor Education is hiring a Site Reliability Engineer

About the Role

You will maintain and improve the reliability of production systems, combining engineering expertise with an operational focus. You will design automated solutions, respond to incidents, and contribute to scalable infrastructure.

Responsibilities

  • Monitor system performance and respond to alerts promptly
  • Design and implement automation to reduce manual operations
  • Collaborate with development teams to improve code deployability
  • Maintain high availability and uptime for critical services
  • Troubleshoot complex production issues across distributed systems
  • Develop tools to streamline incident response and resolution
  • Participate in on-call rotations with support from the team
  • Optimize system performance and resource utilization
  • Enforce best practices in configuration management
  • Contribute to disaster recovery planning and execution
  • Ensure systems meet defined service level objectives
  • Drive improvements in observability and monitoring coverage
  • Support secure and reliable deployment pipelines
  • Document system architecture and operational procedures
  • Evaluate new technologies for operational efficiency
  • Implement proactive alerting to prevent outages
  • Conduct root cause analysis after incidents
  • Promote a blameless post-mortem culture
  • Assist in capacity planning for future growth
  • Integrate security practices into system design
  • Maintain cloud infrastructure configurations
  • Ensure compliance with operational standards
  • Collaborate on scalability challenges
  • Refine incident escalation protocols
  • Support platform audits and reviews

Nice to Have

  • Experience in an education technology environment
  • Background in large-scale distributed systems
  • Familiarity with Kubernetes in production
  • Knowledge of Terraform or similar IaC tools
  • Experience with service mesh technologies
  • Exposure to zero-downtime deployment strategies
  • Understanding of SRE principles and error budgets
  • Prior work with observability platforms like Datadog
  • Involvement in platform security initiatives
  • Track record of mentoring junior engineers

Compensation

Salary of £60,000 - £70,000 plus a benefits package

Work Arrangement

Hybrid or remote options available

Team

Collaborative engineering team focused on platform stability and performance

Our Engineering Culture

  • We value transparency, ownership, and continuous improvement in our technical practices.
  • Engineers are encouraged to propose and lead infrastructure initiatives.

Growth Opportunities

  • You will have access to learning budgets and time for professional development.
  • We support engineers who want to grow into technical leadership roles.

About the Company
Arbor Education
Arbor is on a mission to transform the way schools work for the better. One of its products is SAMpeople, a cloud-based software platform that gives schools a one-stop shop for people management.
Job Details
Department: Engineering
Category: Infrastructure
Posted: 2 days ago