What You'll Do
Take full ownership of technical support by serving as the primary responder for escalated issues from customer success. You'll investigate, triage, and determine whether reported problems stem from configuration, user error, or actual defects—then ensure confirmed bugs are passed to engineering with clear reproduction steps and impact analysis.
Proactively monitor platform stability by reviewing alerts, dashboards, and the health of customer environments. Your goal is to detect and resolve issues before they affect users, minimizing disruption and improving system reliability.
Act as a critical link between customers and engineering teams. Translate real-world user feedback into precise technical insights, filter out noise, and ensure urgent matters receive immediate attention without overwhelming developers.
Build the foundation of the support function from the ground up. Design triage workflows, define internal service level agreements, establish escalation protocols, and create a living knowledge base. Implement a structured ticketing system to replace informal messaging, complete with severity classifications and response time standards.
Conduct a thorough review of existing monitoring tools and alerting systems. Identify blind spots, refine alert thresholds, and implement daily health checks across all customer instances to catch problems early.
Clear a backlog of open technical issues by investigating root causes, resolving them efficiently, and documenting findings to prevent repeat escalations.
Requirements
- Minimum of 2 years in technical support, support engineering, or SRE roles at a software company
- Proficiency in reading logs, navigating Linux systems, and diagnosing issues within application infrastructure
- Familiarity with container technologies such as Docker and Kubernetes, including deployments in cloud or on-prem environments
- Strong written communication skills—able to document bugs clearly so engineers can act without follow-up
- Proven ability to work independently in fast-paced, high-responsibility startup settings
- Hands-on mindset: you prefer investigating an issue directly rather than passing it along
- Calm under pressure and committed to fully resolving issues from start to finish
Preferred Qualifications
- Background in utility IT, energy systems, or other regulated sectors
- Experience with monitoring and incident-response platforms like PagerDuty, Datadog, or BetterUptime
- Exposure to supporting on-premises or air-gapped deployments
- Prior work in customer-facing technical roles at early-stage startups
Technical Stack
- Docker, Kubernetes, Linux
- PagerDuty, Datadog, BetterUptime
Benefits
- Unlimited paid time off
- Comprehensive health, dental, and vision insurance
- 4% 401(k) matching
Work Mode
This is a hybrid role based at the Phoenix headquarters. The schedule is in-office Monday, Tuesday, Thursday, and Friday, with Wednesday designated as a remote day. Employees are expected to work on-site at least 80% of the time.
Eligibility
Applicants must be U.S. citizens or permanent residents due to Department of Energy export compliance regulations.
