Pythian is looking for a Site Reliability Consultant to help our clients build and maintain robust, scalable systems. You will apply SRE principles to improve reliability, performance, and efficiency across diverse technical environments.
What You'll Do
- Design, implement, and manage scalable and reliable infrastructure solutions
- Establish and monitor Service Level Objectives (SLOs) and Service Level Indicators (SLIs)
- Automate operational tasks to improve system efficiency and reduce manual toil
- Conduct post-incident reviews and implement improvements to prevent recurrence
- Collaborate with client engineering teams to embed SRE practices into their development lifecycle
- Provide expert guidance on cloud architecture, performance optimization, and disaster recovery
What We're Looking For
- Proven experience in a Site Reliability Engineering, DevOps, or similar infrastructure-focused role
- Strong background in designing and supporting production systems in cloud environments
- Expertise in infrastructure as code with tools such as Terraform, Ansible, or CloudFormation
- Proficiency with monitoring, observability, and alerting tools
- Experience with containerization and orchestration technologies
- Ability to analyze complex systems, diagnose problems, and implement effective solutions
- Excellent communication and client-facing consulting skills
Work Mode
This is a remote position open to candidates in Canada, with a preference for those who can work effectively in the Eastern, Central, or Pacific time zones.
Pythian is an equal opportunity employer.