San Francisco, CA | New York City, NY · Hybrid · $405,000 - $485,000 USD

Anthropic is hiring an Engineering Manager, Inference Routing and Performance

Requirements

  • 5+ years of engineering management experience, ideally with at least part of that time leading teams on critical-path production infrastructure at scale
  • Deep systems background: load balancing, scheduling, cache-coherent distributed state, high-performance networking, or similar. You need enough depth to make architectural calls about routing and efficiency, and to evaluate candidates who work at the kernel and framework level
  • Have shipped performance improvements in large-scale systems and can explain, with numbers, what the impact was
  • Have run production infrastructure with real operational stakes: on-call rotations, incident response, capacity events, and deploy discipline
  • Results-oriented with a bias toward impact, and comfortable working in a space where throughput, latency, stability, and feature velocity all pull in different directions
  • Build strong relationships across team boundaries; this is a seam role, and much of the job is making sure other teams can rely on yours
  • Curious about machine learning systems. You don't need an ML research background, but you should want to learn how transformer inference actually works and how it shapes the systems problems

Nice to Have

  • Experience with LLM inference serving — KV caching, continuous batching, request scheduling, prefill/decode disaggregation
  • Background in cluster schedulers, load balancers, service meshes, or coordination planes at scale
  • Familiarity with heterogeneous accelerator fleets (GPU/TPU/Trainium) and how hardware differences affect workload placement
  • Experience with GPU/accelerator programming, ML framework internals, or OS-level performance debugging, enough to follow and evaluate the technical work, not necessarily to do it daily
  • Experience leading teams at supercomputing or hyperscaler infrastructure scale
  • Experience leading teams through rapid-growth periods where hiring and onboarding competed with roadmap delivery
About the Company
Anthropic
Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole.
Job Details
Department: Inference Routing
Category: Management
Posted: 2 hours ago