Responsibilities
- Act as the primary technical liaison for a key customer, managing the full technical relationship across compute, networking, storage, and physical infrastructure
- Establish and lead recurring technical engagements, including status updates, steering sessions, and executive-level business reviews
- Convert customer operational insights into prioritized inputs for Engineering, Product, and Infrastructure planning
- Manage end-to-end resolution of technical issues, including escalation coordination and root cause analysis across infrastructure teams
- Oversee hardware return management and lifecycle operations, including testing, spare parts inventory, and health monitoring for large GPU clusters
- Maintain advanced knowledge of customer infrastructure, including GPU systems, high-speed networks, and scalable storage, providing guidance on setup and troubleshooting
- Develop and manage observability frameworks, including alerting rules, dashboards, and proactive system health oversight across all infrastructure layers
- Coordinate data center and facilities activities with internal teams and third-party providers to meet service level agreements and maintain system uptime
- Lead project execution for capacity scaling, managing node deployment from delivery to production readiness
Benefits
- Competitive compensation
- Startup equity
- Health insurance
- Other benefits
- Flexible remote work
Compensation
Competitive compensation
Work Arrangement
Hybrid