This role is for a skilled engineer focused on managing and resolving high-severity technical incidents. You will serve as the central point of contact when critical issues arise, guiding teams through diagnosis, mitigation, and resolution. Your work ensures systems return to full operation quickly and with minimal impact on users.
Key Responsibilities
- Lead real-time response to critical system alerts and service disruptions
- Coordinate communication between engineering, support, and product teams during incidents
- Document root causes and drive follow-up actions to prevent recurrence
- Refine escalation procedures and improve monitoring effectiveness
- Identify patterns in recurring issues and recommend systemic improvements
Qualifications
Successful candidates will have a background in systems engineering, operations, or technical support, with demonstrated experience managing complex technical escalations. Strong analytical thinking, clear communication under pressure, and familiarity with incident management frameworks are essential. Proficiency with monitoring tools and cloud infrastructure is required.