5+ years of demonstrated experience building large-scale, fault-tolerant, distributed systems and API microservices.
Strong background in designing, analyzing, and improving the efficiency, scalability, and stability of complex systems.
Excellent understanding of low-level OS concepts: multi-threading, memory management, networking, and storage performance.
Expert-level programming skills in one or more of Rust, Go, Python, and TypeScript.
Bachelor’s or Master’s degree in Computer Science, Computer Engineering, or a related field, or equivalent practical experience.

Knowledge of modern LLMs and generative models and how they are served in production is a plus.
Experience working with the open-source ecosystem around inference is highly valuable; familiarity with SGLang, vLLM, or NVIDIA Dynamo is especially useful.
Experience with Kubernetes or container orchestration is a strong plus.
Familiarity with GPU software stacks (CUDA, Triton, NCCL) and HPC technologies (InfiniBand, NVLink, MPI) is a plus.

Together AI is hiring a Senior Backend Engineer, Inference Platform