Responsibilities
- Own the real-time API layer (WebSocket + HTTP streaming) that powers Together's voice platform.
- Design autoscaling and orchestration for voice workloads running on tens of thousands of GPUs.
- Build the developer experience — APIs, observability, and tooling — for a fast-growing product area.
- Work with production voice customers (contact centers, AI agents, communication platforms) to ship what they actually need.
- Join a small, early-stage team with outsized impact on a new product line.
Requirements
- 5+ years of experience building large-scale, real-time distributed systems and API services.
- Deep expertise in real-time streaming infrastructure — WebSocket server architecture, Server-Sent Events, bidirectional streaming, connection multiplexing, and stateful protocol design.
- Expert-level programming in TypeScript and Python; experience with Rust is a plus.
- Strong distributed systems fundamentals: load balancing, autoscaling, rate limiting, and traffic shaping for latency-sensitive workloads.
- Experience with Kubernetes — including custom autoscalers, resource management, and health checking for stateful services.
- Strong product sense — you care about API ergonomics and think about what developers building voice apps actually need.
- Comfort working on a small, early-stage team where you'll wear multiple hats and move fast.
- Bachelor's or Master's degree in Computer Science, Computer Engineering, or a related field, or equivalent practical experience.
Nice to Have
- Experience with audio or media protocols and codecs (WebRTC, G.711, PCM encoding) is a strong plus.
- Familiarity with ML model serving infrastructure and how inference engines work; you'll interface with the serving layer regularly.
- Full-stack experience (React, Next.js) for contributing to developer-facing tooling.
Team
Team size: small. Structure: high-impact.