Responsibilities
- Own the model serving stack that powers Together's voice platform across STT, TTS, and speech-to-speech.
- Work directly with state-of-the-art accelerators (H100s, H200s, B200s) to optimize voice model inference.
- Collaborate with model partners (Cartesia, Deepgram, Rime, and others) to bring their models to production on Together's infrastructure.
- Build quality evaluation frameworks that guide model selection for customers and inform the roadmap.
- Join a small, early-stage team with outsized impact on a fast-growing product area.
Requirements
- 5+ years of experience in ML engineering, with a focus on model serving, inference optimization, or ML infrastructure.
- Hands-on experience with LLM serving engines (vLLM, SGLang, TensorRT-LLM, or similar) — comfortable reading and modifying engine internals, not just using APIs.
- Strong proficiency in Python and PyTorch; experience with GPU profiling and optimization (CUDA, memory management, kernel-level debugging).
- Track record of shipping ML systems to production with measurable performance improvements.
- Strong product sense — you think about what developers building voice apps actually need, not just what's technically interesting.
- Comfort working on a small, early-stage team where you'll wear multiple hats and move fast.
- Bachelor's or Master's degree in Computer Science, Electrical Engineering, or a related field, or equivalent practical experience.
Nice to Have
- Experience with speech and audio ML (ASR, TTS architectures, audio signal processing) is a strong plus but not required — you can learn this quickly if you have strong ML engineering fundamentals.
- Familiarity with audio codecs and tokenization schemes (SNAC, Encodec, DAC).
- Experience training or fine-tuning speech models.
Team
A small, high-impact team.
Additional Information
- Compensation: We offer competitive compensation, startup equity, health insurance, and other benefits. The US base salary range for this full-time position is $200,000 to $260,000, plus equity and benefits. Our salary ranges are determined by location, level, and role. Individual compensation will be determined by experience, skills, and job-related knowledge.