Responsibilities
- Direct the resolution of significant incidents across their full lifecycle by coordinating the response activities of technical teams.
- Take accountability for critical incidents and lead coordination to resolve high-impact events swiftly.
- Identify and remove obstacles, escalate when necessary, and maintain progress during troubleshooting.
- Ensure compliance with defined incident management procedures and standards.
- Help refine incident response playbooks and supporting documentation.
- Manage communication flows internally and externally during major incidents.
- Convert technical information into clear business impact terms, including scope, severity, risk, estimated resolution time, and confidence level.
- Provide consistent, timely updates to stakeholders throughout the incident lifecycle.
- Oversee the safe implementation of remediation steps, rollbacks, feature toggles, and system failovers.
- Facilitate post-incident reviews with stakeholders to verify incident details and assign root cause analysis leads.
- Monitor and report key incident metrics to detect trends and opportunities for systemic enhancements.
- Support the Change Management and Problem Management functions in carrying out their duties, as needed.