Zendesk is seeking a Machine Learning Engineer to shape the architecture and strategy of our GenAI platform. In this role, you will lead large, cross-functional efforts that standardize evaluation, access, observability, and orchestration for LLMs across product lines, making AI experiences safe, performant, and trustworthy for millions of end users.
What You'll Do
- Architect and lead delivery of cross-product GenAI platform capabilities: LLM Proxy, model registry integrations, vendor abstraction, and cost/usage attribution.
- Own the design and scaling of evaluation and benchmarking frameworks (A/B, offline, continuous regression tests) used to gate model releases.
- Define company-wide standards for safety, tone, and reasoning evaluation; drive adoption of evaluation rubrics and automated checks.
- Identify systemic failure modes across products and model families; prioritize mitigations, monitoring, and retraining strategies in partnership with ML teams.
- Drive platform reliability, observability, and capacity planning for LLM services; implement rate limiting, throttling, and SLA practices.
- Lead efforts to enable agentic workflows and safe tool use, defining integration patterns and security boundaries.
- Partner with engineering leadership, product, research, and legal/policy teams to translate risk, cost, and quality tradeoffs into platform design decisions.
- Mentor senior engineers, coordinate cross-team roadmaps, and represent the platform in technical forums.
What We're Looking For
- BS in Computer Science, Engineering, or a related field, or equivalent practical experience.
- 8+ years of industry experience in backend, platform, or ML infrastructure engineering, with significant production responsibilities.
- Demonstrable experience with cloud-native infrastructure (Kubernetes, AWS/GCP/Azure) and production ML/LLM systems.
- Strong track record of building evaluation and monitoring for ML systems.
- Track record of delivering large, cross-team distributed systems and ML infrastructure projects to production.
- Deep understanding of LLMs, inference serving patterns, vendor routing strategies, and platform design for ML workloads.
- Strong system design skills: scalable architectures, service reliability engineering, capacity planning, and cost optimization.
- Proficiency in Python (or comparable server-side language), Kubernetes, cloud infrastructure, and observability tooling.
- Experience creating evaluation frameworks, gold-standard datasets, and regression suites for language models.
- Excellent stakeholder-management skills: you can synthesize product, research, and engineering constraints into pragmatic platform solutions.
- Proven ability to set technical strategy, mentor senior engineers, and drive broad adoption of platform standards.
Nice to Have
- Experience building model registries, feature stores, or inference platforms at scale.
- Background in agentic AI frameworks, workflow orchestration, or tool-using models.
- Prior experience influencing company-wide ML safety, trust, or quality frameworks.
- Advanced degree (MS/PhD) in ML/NLP or related field and/or published research in relevant areas.
Technical Stack
- Python
- Kubernetes
- AWS, GCP, Azure
Work Mode
This is a hybrid position based in Pune, India.
Zendesk is an equal opportunity employer, and we’re proud of our ongoing efforts to foster global diversity, equity, & inclusion in the workplace. We are an AA/EEO/Veterans/Disabled employer.