Remote-Friendly (Travel-Required) | San Francisco, CA | New York City, NY | Hybrid Employment | $500,000 - $850,000 USD

Anthropic is hiring a Research Engineer, Machine Learning (RL Velocity)

About the Role

The role involves developing and iterating on machine learning models with a focus on reinforcement learning, contributing to both research and engineering efforts to accelerate training pipelines and improve system capabilities.

Responsibilities

  • Design and implement reinforcement learning experiments
  • Optimize training infrastructure for scalability and speed
  • Collaborate on improving model training efficiency
  • Iterate on algorithms to enhance learning stability
  • Develop tools for monitoring and analyzing training runs
  • Support large-scale model training workflows
  • Refine reward modeling techniques
  • Debug and resolve issues in training pipelines
  • Contribute to versioning and reproducibility of experiments
  • Work closely with researchers to prototype new ideas
  • Improve data processing pipelines for training
  • Evaluate model behavior during training phases
  • Integrate feedback mechanisms into learning loops
  • Assist in ablation studies to validate design choices
  • Optimize resource utilization across compute clusters
  • Document methods and results for internal review
  • Ensure consistency across experimental setups
  • Support deployment of training systems in production-like environments
  • Develop automated testing for training components
  • Contribute to cross-team knowledge sharing
  • Analyze training dynamics to inform future iterations
  • Implement safety checks within learning frameworks
  • Refactor code for maintainability and performance
  • Assist in benchmarking against prior approaches
  • Help define success metrics for training objectives

Nice to Have

  • PhD in computer science or related field
  • Prior research in reinforcement learning
  • Contributions to open-source machine learning projects
  • Experience with large language models
  • Work on safety or alignment in AI systems
  • Publications in machine learning venues
  • Familiarity with formal verification methods
  • Experience with policy gradient methods
  • Knowledge of human-in-the-loop training
  • Background in software engineering best practices
  • Experience mentoring junior engineers
  • Work with real-time feedback systems
  • Understanding of ethical AI development
  • Prior role in a research-forward organization
  • Involvement in model interpretability efforts

Compensation

Competitive salary based on experience and location

Work Arrangement

Hybrid work model with office and remote options

Team

Part of a research-focused team advancing core machine learning capabilities

Research Focus

  • Work will center on accelerating reinforcement learning pipelines
  • Emphasis on improving training speed and model reliability
  • Projects aim to reduce iteration time for new ideas

Impact

  • Engineers directly influence model safety and performance
  • Work contributes to foundational improvements in training systems

Available for qualified candidates

About the Company
Anthropic
Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole.
Job Details
Department RL Velocity
Category other
Posted 2 hours ago