London, UK | Hybrid | Employment | £370,000 - £630,000 GBP

Anthropic is hiring a Research Engineer, Machine Learning (RL Velocity)

About the Role

The role involves building and refining machine learning systems with a focus on reinforcement learning, contributing to core research initiatives, and accelerating model development cycles.

Responsibilities

  • Design and implement reinforcement learning algorithms
  • Optimize training pipelines for efficiency and scalability
  • Collaborate with researchers to prototype new model architectures
  • Evaluate model performance using rigorous testing frameworks
  • Debug and improve system-level issues in training infrastructure
  • Contribute to codebases supporting large-scale experiments
  • Analyze training dynamics to inform research direction
  • Support deployment of experimental models into test environments
  • Refactor and maintain core machine learning software modules
  • Instrument systems to collect detailed training metrics
  • Improve data throughput in distributed training setups
  • Integrate feedback from safety and alignment teams
  • Ensure code meets performance and reliability standards
  • Document technical designs and implementation details
  • Assist in benchmarking against prior state-of-the-art methods
  • Work closely with infrastructure engineers on system improvements
  • Help define best practices for experimental tracking
  • Contribute to internal tools for model analysis
  • Support reproducibility of research results
  • Participate in technical discussions on algorithm design
  • Monitor training runs for anomalies or regressions
  • Optimize resource utilization across compute clusters
  • Implement safeguards for stable training behavior
  • Translate research concepts into functional code
  • Iterate rapidly based on empirical results

Nice to Have

  • Advanced degree in a relevant field
  • Prior work in reinforcement learning research
  • Contributions to open-source machine learning projects
  • Experience with transformer-based architectures
  • Knowledge of safety evaluation methods
  • Publication record in machine learning venues
  • Hands-on experience with large model training
  • Familiarity with formal methods in AI safety
  • Background in systems engineering for ML
  • Experience mentoring junior engineers
  • Understanding of ethical implications in AI development
  • Involvement in interdisciplinary research teams

Compensation

Competitive salary and equity package

Work Arrangement

Hybrid or remote options available

Team

Part of a research-focused machine learning team advancing reinforcement learning systems

Research Focus

  • Work will center on improving reinforcement learning systems with an emphasis on stability, scalability, and alignment with intended behavior.
  • Engineers will engage in both theoretical exploration and practical implementation to advance core capabilities.

Impact

  • Contributions will directly influence the development of safer and more controllable AI systems.
  • Work will support long-term research goals in reliable machine learning.

Available for qualified candidates

About company
Anthropic
Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole.
Job Details
Department: RL Velocity
Category: Other
Posted: 2 hours ago