At Cohere, our mission is to scale intelligence to serve humanity. We are looking for a Member of Technical Staff for the Integration/RL Team. This is a Research Engineer role focused on improving the overall quality of our post-training codebase. You will implement new tools, optimize algorithms, and scale distributed RL to build a robust training ecosystem.

What You'll Do

Design and write high-performing, scalable software for training large language models.
Develop new tools to support and accelerate research and LLM training.
Coordinate with engineering teams (Infrastructure, Efficiency, Serving) and scientific teams (Agent, Multimodal) to create a strong, integrated post-training ecosystem.
Craft and implement techniques to improve performance and speed up training cycles across SFT, offline preference, and RL regimes.
Research, implement, and experiment with ideas on our cluster and data infrastructure.
Collaborate with scientists, engineers, and teams across the company.

What We're Looking For

Extremely strong software engineering skills.
An appreciation for test-driven development, clean code, and reducing technical debt.
Proficiency in Python and related ML frameworks and compilers such as JAX, PyTorch, and/or XLA/MLIR.
Experience using and debugging large-scale distributed training strategies, including memory and speed profiling.

Nice to Have

Experience with distributed training infrastructure such as Kubernetes and associated frameworks such as Ray.
Hands-on experience with the post-training phase of model training, with a strong emphasis on scalability and performance.
Experience in ML, LLM, and RL academic research.

Technical Stack

Python, JAX, PyTorch, XLA/MLIR, Kubernetes, Ray

Team & Environment

You will be part of the Integration team, responsible for developing and scaling machine learning algorithms and infrastructure for LLM post-training.

Benefits & Compensation

Weekly lunch stipend, in-office lunches & snacks.
Full health and dental benefits, including a separate budget for mental health.
100% Parental Leave top-up for 6 months for employees in Canada, the US, and the UK.
Personal enrichment benefits for arts and culture, fitness and well-being, quality time, and workspace improvement.
Remote-flexible work; offices in Toronto, New York, San Francisco, and London; co-working stipend.
6 weeks of vacation.

Work Mode

This is a hybrid role open to candidates in London, Paris, Toronto, San Francisco, and New York.

We value and celebrate diversity and strive to create an inclusive work environment for all. We welcome applicants from all backgrounds and are committed to providing equal opportunities.