Responsibilities
- Lead and develop a high-performing distributed team
  - Manage and coach ~12 L2/L3 Support Engineers, including hiring, onboarding, coaching, performance, and operational rigor.
  - Establish and own cadences such as daily/regular triage, weekly KPI reviews, monthly ops reviews with tribe leadership, and a quarterly improvement roadmap.
- Own the support operating model, process, and governance
  - Own the L1–L4 support model: triage/routing, escalation, SLAs/OLAs, severities, and comms.
  - Drive cross-functional improvements (process, tooling, knowledge, automation) and ensure consistent execution across tribes.
- Drive metrics, transparency, and continuous improvement
  - Build and run an operational KPI system, dashboards, and cadences; ensure leadership visibility into service health, risks, and trends, including:
    - Time to first response, resolution time, SLA/OLA compliance
    - Backlog size/aging, inflow vs. outflow
    - Reopen rate, escalation rate, handoff cycles, assignment accuracy, and quality of analysis/diagnosis
- Observability, monitoring, runbooks, and operational readiness
  - Improve observability: alerting/monitoring quality and coverage, and runbooks/playbooks (with Engineering/Platform).
  - Strengthen diagnostics and knowledge management to reduce repeat issues and accelerate L4 resolution.
- Stakeholder management and alignment across tribes/squads
  - Partner with Tribe leadership, Engineering Managers, Tech Leads, Product, and Customer-facing teams to align priorities and execute improvements.
  - Ensure Support Engineering and Squads operate “in perfect harmony,” with shared accountability and minimal friction between L2/L3 and L4.
Requirements
- 8+ years of experience in Support Engineering / Application Support / SaaS Operations (complex web + APIs), with 3+ years of leading support and driving performance across distributed, multi-cultural teams.
- Proven experience designing and implementing L1–L4 support operating models, including standardized triage/routing, escalation paths, and measurable improvements in observability/monitoring (alerts, detection coverage, runbooks).
- Strong track record of improving operational KPIs, including but not limited to time to first response, resolution time, backlog health, SLA/OLA compliance, and breach reduction.
- Strong incident and problem management experience, including RCAs and prevention loops.
- Excellent stakeholder management and communication skills, able to drive alignment across Customer Care, Support, and Engineering tribes/squads.
Nice to Have
- Experience with Salesforce (case intake / customer care), Jira (work tracking / execution), and Azure DevOps (engineering tracking/defects, where applicable).
- Experience with distributed Microsoft/.NET and/or Azure-based microservice environments.
Work Arrangement
Hybrid
Team
Team size: 12. Structure: distributed
Additional Information
- The team operates a follow-the-sun model (24/5) at tribe level.
