RBC Borealis is hiring a Lead Site Reliability Engineer responsible for the availability, reliability, scalability, and performance of our critical applications. You will balance hands-on production support with proactive initiatives focused on automation and observability.
What You'll Do
- Perform application production support, including off-hours coverage.
- Develop SRE solutions like monitoring, alerting, anomaly detection, self-healing, and reliability testing.
- Monitor availability and system health for the production environment.
- Build software and systems to manage infrastructure and applications, improving reliability and time-to-market.
- Assist in incident and problem management.
- Maintain technology currency and automate repetitive tasks.
- Ensure application availability meets service level objectives.
- Ensure compliance and maintain segregation of duties for systems in scope.
- Implement comprehensive monitoring, alerting, and self-healing capabilities.
- Detect, diagnose, and resolve incidents; analyze problems; and manage required changes.
- Implement service level indicators and objectives for mission-critical applications.
- Ensure compliance with regulatory and security requirements.
- Stay current with emerging technologies to drive innovation and efficiency.
What We're Looking For
- 3+ years of experience in Application Support, Software Development (SDLC), and Operations.
- Strong proficiency in at least two programming languages or platforms, such as Java, Python, .NET, or SQL.
- Understanding of resilient IT solutions and enhancing reliability through automation.
- Experience across varied environments: Linux, Windows, Databases, Cloud, distributed/mainframe, and APIs.
- Hands-on experience with DevOps/SRE tools like Ansible, Dynatrace, Moogsoft, PagerDuty, ServiceNow, Elastic Stack, Jenkins, and others.
- Excellent communication, analytical, and problem-solving skills for complex incidents.
- Effective negotiation and stakeholder management skills.
Nice to Have
- Prior SRE experience in the financial services industry.
- Knowledge of Digital Identity and Access Management, Internet/Mobile Banking Platforms, Microservices, Data Services, Test Automation, or Corporate applications.
Technical Stack
- Languages & Data: Java, Python, .NET, SQL, Databases
- Platforms: Linux, Windows, Cloud
- Tools: Ansible, Dynatrace, Moogsoft, PagerDuty, ServiceNow, Elastic, Logstash, Kibana, LogicMonitor, Jenkins, Cucumber, CA Work Automation, Power BI, ETL tools
Benefits & Compensation
- A comprehensive Total Rewards Program including competitive compensation, bonuses, and flexible benefits.
- Continued opportunities for career advancement.
- World-class training, coaching, and development opportunities.
- Support from a dynamic, collaborative, and high-performing team.
- Opportunity to achieve significant success and grow your career with RBC.
Work Mode
This role is onsite in Toronto, Canada.
At RBC, we are guided by shared values of Client First, Integrity, Collaboration, Respect and Excellence. We believe an inclusive, diverse workplace is core to our growth and are committed to maintaining a respectful environment where all employees can perform at their best, collaborate effectively, and drive innovation.

