About the Role

Build and maintain systems that empower researchers to iterate quickly on machine learning models by delivering robust, efficient tooling and infrastructure.

Responsibilities

Develop core infrastructure for training and evaluating machine learning models
Create tools that streamline experimental workflows for research teams
Optimize performance and scalability of distributed training systems
Collaborate with researchers to understand system requirements
Design abstractions that simplify complex ML workflows
Improve debugging and monitoring capabilities for training jobs
Maintain reliability and efficiency across compute clusters
Implement versioning and reproducibility features for experiments
Support integration of new hardware into existing pipelines
Troubleshoot low-level system issues affecting model training
Ensure compatibility across software and hardware configurations
Contribute to documentation and internal tooling standards
Evaluate new technologies for potential adoption in the research stack
Automate repetitive tasks in the research development cycle
Work closely with software engineers to align tooling with research needs
Enhance data handling pipelines for faster model input
Build interfaces between research code and production systems
Monitor system usage patterns to guide infrastructure improvements
Support secure access to sensitive model assets
Refactor legacy systems to improve maintainability
Develop APIs for internal research tools
Instrument systems for performance measurement and analysis
Assist in capacity planning for compute resources
Participate in code reviews and system design discussions
Ensure tools meet evolving research demands

Nice to Have

Advanced degree in computer science or related field
Direct experience with large-scale model training
Contributions to open-source ML projects
Background in high-performance computing
Experience with reinforcement learning systems
Knowledge of formal verification methods
Familiarity with safety-critical software development
Experience working with experimental programming languages
Research publications in systems or ML conferences
Experience in startup or research lab environments

Compensation

Competitive salary and benefits package offered

Work Arrangement

Hybrid or remote work options available

Team

Join a research-focused engineering team building advanced AI systems

Research Culture

Work in an environment that values rigorous inquiry and methodical development
Engage with interdisciplinary teams exploring AI safety and capabilities
Contribute to long-term research goals with real-world impact

Technology Stack

Use modern ML frameworks and custom tooling for model development
Work with GPU clusters and distributed training infrastructure
Leverage internal systems for experiment tracking and analysis

Visa sponsorship may be available for qualified candidates

Anthropic is hiring a Machine Learning Systems Engineer, Research Tools

