Red Hat seeks a Senior Site Reliability Engineer (SRE) to join us remotely in British Columbia or Alberta, Canada. You will design, scale, and operate our managed cloud services built on Red Hat OpenShift, tackling complex, large-scale distributed systems challenges unique to our offerings.
What You'll Do
- Contribute code to increase the scalability and reliability of the service
- Contribute software tests and participate in peer review to increase codebase quality
- Help develop peers’ capabilities through knowledge sharing, mentoring, and collaboration
- Participate in a regular on-call schedule, including occasional paid weekend and holiday shifts
- Practice sustainable incident response and blameless postmortems
- Resolve customer issues escalated from the Red Hat Global Support team
- Work within a small agile team to develop SRE software, support your peers, and plan improvements
- Explore and experiment with emerging AI technologies relevant to software development, proactively identifying opportunities to incorporate new AI capabilities into workflows and tooling
What We're Looking For
- Bachelor's degree in Computer Science or related field, or equivalent experience
- 5+ years of experience managing Linux servers (RHEL, CentOS, or Fedora) hosted at a cloud provider like AWS, GCE, or Microsoft Azure
- 3+ years of experience with enterprise systems monitoring; knowledge of Prometheus is a plus
- 3+ years of experience with enterprise configuration management software like Ansible, Puppet, or Chef
- 2+ years of experience programming with at least one object-oriented language; Golang, Java, or Python preferred
- 2+ years of experience delivering a hosted service
- Demonstrated ability to quickly and accurately troubleshoot system issues
- Solid understanding of standard TCP/IP networking and common protocols like DNS and HTTP
- Solid communication skills and experience working directly with and presenting to customers
Nice to Have
- 1+ year of experience with Kubernetes
- 1+ year of experience with Docker-based containers
Technical Stack
- Linux: RHEL, CentOS, Fedora
- Cloud Providers: AWS, GCE, Microsoft Azure
- Monitoring: Prometheus
- Configuration Management: Ansible, Puppet, Chef
- Languages: Golang, Java, Python
- Containers & Orchestration: Kubernetes, Docker-based containers
Team & Environment
You will join a small agile team within a globally distributed SRE organization. We value collaboration, openness, transparency, diverse perspectives, and a blameless culture focused on continuous improvement. We embrace change in a fast-evolving technology landscape and approach our work with a growth mindset, actively encouraging the responsible and thoughtful use of AI to streamline workflows, reduce complexity, and improve efficiency.
Work Mode
This is a remote position open to candidates residing in British Columbia or Alberta, Canada.
Red Hat is proud to be an equal opportunity workplace and an affirmative action employer. We review applications for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, ancestry, citizenship, age, veteran status, genetic information, physical or mental disability, medical condition, marital status, or any other basis prohibited by law.





