Responsibilities
- Educate and assist product teams on reliability best practices, implementation strategies, and efficient use of current platform tools
- Enable product teams to enhance system performance and uptime
- Engage directly in coding and infrastructure work to support reliability enhancements
- Deliver detailed insights to the Platform group on necessary infrastructure improvements based on direct codebase experience
- Help product teams build proofs of concept that evolve the deployment architecture in line with business growth, scalability, and resilience goals
- Work closely with product teams during feature and service launches to ensure compliance with reliability and performance benchmarks
- Advise teams on designing systems that remain resilient and fail gracefully under high load
- Support application teams after incidents, including following up on action items and contributing to post-mortem reports
- Examine incident data to detect patterns and recommend improvements, sharing findings with product teams
- Tackle complex technical challenges with high potential for innovation in the energy sector
Team
You will be part of the Reliability group, which is responsible for architecting, developing, and maintaining resilient and scalable infrastructure. The 'Product Reliability' team is newly created.