Responsibilities
- Invent, design, and implement RL environments and evaluations.
- Conduct experiments and shape our research roadmap.
- Deliver your work into training runs.
- Collaborate with other researchers, engineers, and performance engineering specialists both within and outside Anthropic.
Requirements
- Expertise with accelerator programming (CUDA, ROCm, Triton, Pallas).
- Experience programming in an ML framework (JAX or PyTorch).
- Experience working across the stack: kernels, model code, and distributed systems.
- Ability to balance research exploration with engineering implementation.
- Passion for AI's potential and a commitment to developing safe and beneficial systems.
Nice to Have
- Experience with reinforcement learning.
- Experience porting ML workloads between different types of accelerators.
- Familiarity with LLM training methodologies.
Benefits
- Competitive compensation and benefits.
- Optional equity donation matching.
- Generous vacation and parental leave.
- Flexible working hours.
- Lovely office space in which to collaborate with colleagues.
Team
Structure: Reinforcement Learning teams
Additional Information
- Your safety matters to us. To protect yourself from potential scams, remember that Anthropic recruiters only contact you from @anthropic.com email addresses. In some cases, we may partner with vetted recruiting agencies who will identify themselves as working on behalf of Anthropic. Be cautious of emails from other domains. Legitimate Anthropic recruiters will never ask for money, fees, or banking information before your first day. If you're ever unsure about a communication, don't click any links—visit anthropic.com/careers directly for confirmed position openings.