Zendesk is seeking a Senior Machine Learning Engineer to lead the next wave of GenAI infrastructure development. You'll build and operate benchmarking, evaluation, and inference systems that ensure our AI-driven customer support experiences are reliable, safe, and cost-effective. This role is based in our Pune, India office.
What You'll Do
- Build and maintain benchmarking frameworks for LLMs (A/B tests, offline and end-to-end evaluation pipelines) to measure quality, latency, and cost trade-offs.
- Design, develop, and operate LLM Proxy functionality: routing, safety filters, caching, rate limiting, and cost attribution across vendors.
- Implement monitoring, observability, and alerting for LLM services (latency, error rates, hallucination signals, cost per call).
- Lead the design and automation of evaluation suites and gold-standard datasets for ticket replies, summaries, intent detection, and recommendations.
- Collaborate with applied ML, product, and platform teams to translate evaluation findings into model improvements, release criteria, and mitigation strategies.
- Build orchestration tooling for multi-step agentic workflows and integrations that safely interact with external tools and Zendesk products.
- Establish engineering best practices around security, reliability, testing, CI/CD, and performance tuning for ML services.
- Mentor engineers, help define roadmap priorities, and drive cross-team initiatives that increase platform adoption and developer productivity.
What We're Looking For
- BS in Computer Science, Engineering, or a related field, or equivalent practical experience.
- 5+ years of industry experience building and running production ML systems or backend services, with strong ownership from design through deployment.
- Strong backend engineering skills in Python or another server-side language, with familiarity with distributed systems and production CI/CD practices.
- Hands-on experience with LLM systems, inference-serving patterns, or GenAI application infrastructure (prompting, caching, vendor routing).
- Experience with Kubernetes, Docker, cloud platforms (AWS/GCP/Azure), and monitoring/observability tooling.
- Proven ability to design evaluation pipelines, build gold-standard datasets, and run rigorous A/B and offline evaluations for ML services.
- Clear communicator who can translate technical trade-offs into product decisions and measurable outcomes.
- Curious, collaborative, and committed to building reliable, safe, and cost-effective GenAI infrastructure.
Nice to Have
- Experience with LLM providers and open model families (OpenAI, Anthropic, Google, Meta Llama) and vector databases.
- Background in building agentic orchestration or tool-using AI systems.
- Prior experience implementing cost attribution, vendor routing, or request-level caching for LLMs.
- Advanced degree (MS/PhD) or demonstrated research contributions in ML/NLP.
Technical Stack
- Languages: Python
- Infrastructure: Kubernetes, Docker
- Cloud Platforms: AWS, GCP, Azure
Team & Environment
You will be part of the AI/ML Platform team, collaborating across applied ML, product, and platform engineering to scale Zendesk's GenAI capabilities.
Work Mode
This is a hybrid position located in Pune, India.
Zendesk is an equal opportunity employer, and we’re proud of our ongoing efforts to foster global diversity, equity, & inclusion in the workplace. We are an AA/EEO/Veterans/Disabled employer.



