Responsibilities
- Own and evolve our edge and cloud infrastructure across Cloudflare, Google Cloud, and Vercel.
- Scale and operate our data layer, including Spanner, ClickHouse, and Postgres.
- Keep LLM inference serving fast as traffic grows rapidly.
- Partner with engineering leadership on capacity, reliability, and cost across the routing layer, with ownership of the systems carrying production traffic.
- Set the bar and playbook for how we run infrastructure and operations as the team grows — tooling, observability, on-call, and the patterns other engineers build against.
Requirements
- 5+ years building and operating production infrastructure at companies where uptime, latency, and cost matter.
- Proven experience with cloud platforms (GCP, AWS, Azure) and edge-first serverless platforms (e.g. Cloudflare Workers).
- Deep expertise in operating large-scale databases (e.g. Postgres, Spanner).
- A full-stack TypeScript shop won't faze you; you can move across the stack when the platform needs it.
- High agency and a bias toward action. You don't wait for tickets — you see the bottleneck and fix it.
- AI-forward in your workflow. You use coding agents, MCPs, and LLMs heavily and have opinions about what works.
- Pragmatic about tradeoffs between speed and simplicity.
Nice to Have
- Existing OpenRouter user, or someone with active side projects in AI products/infrastructure or developer tooling.
Benefits
- Base salary for this full-time position in the United States ranges from $215,000 to $285,000, plus benefits and equity.
- Compensation for internationally based candidates will vary to reflect local market conditions.
Work Arrangement
Remote (Worldwide)
Additional Information
- This is a hands-on IC role with broad surface area and little process between you and shipping.
- Nobody checks every box; above all, we're looking for someone who is excited to join the team.