Odyssey is looking for an Infrastructure Engineer to build and operate the compute and infrastructure foundation that enables our groundbreaking AI research and products. We are motivated by building for the frontier and by shaping the compute foundation of a lab that is redefining media creation and interaction.
What You'll Do
- Develop and operate a low-latency model inference platform designed for high availability, scalability, and resource efficiency.
- Engineer and scale core data processing infrastructure for petabyte-scale datasets.
- Design, build, and maintain large-scale, GPU-based training clusters for deep learning, focusing on usability, throughput, and reliability.
- Automate infrastructure provisioning, configuration, monitoring, and alerting using Infrastructure as Code principles.
- Drive performance tuning, cost optimization, and reliability improvements across the entire stack.
- Collaborate closely with researchers and product developers to understand requirements, optimize workflows, and improve platform usability.
What We're Looking For
- Strong programming skills in languages like Python or Go.
- Solid understanding of software engineering best practices.
- Deep, hands-on experience with containerization (Docker), container orchestration (Kubernetes), and Infrastructure as Code (Terraform).
- Proven experience building and managing large-scale distributed systems that run GPU compute workloads.
- Experience designing infrastructure for ML workloads where performance, parallelism, and data movement are critical.
- A collaborative mindset and excellent communication skills.
- Passion for building developer-friendly platforms.
Technical Stack
- Languages: Python, Go
- Infrastructure & Orchestration: Docker, Kubernetes, Terraform
- Data Processing: Flyte, Ray
Odyssey is an equal opportunity employer.





