Requirements
- Bachelor’s degree or higher in Computer Science, Engineering, or a related discipline, or equivalent professional background
- Minimum of 8 years of experience developing production-grade software, including at least 3 years in a technical leadership role for AI or ML product development from concept through deployment
- At least 5 years of professional experience using Python and ML/NLP libraries such as PyTorch, Hugging Face, or scikit-learn, with proven experience deploying models or ML pipelines into production environments
- Minimum of 3 years building and maintaining Spring Boot microservices in production, including responsibilities in API design, automated testing, CI/CD pipelines, and on-call incident response
- Demonstrated experience shipping at least one production system based on large language models that uses tool or function calling to interface with external services or enterprise APIs, with defined evaluation and monitoring practices
- Proven track record of building and deploying retrieval-augmented generation (RAG) or semantic search solutions using vector search, embeddings, and integration with external knowledge sources on platforms such as Azure Cognitive Search
- Hands-on experience designing and deploying production workloads on Microsoft Azure, leveraging services like Azure OpenAI, Azure Functions, Event Hubs, Cognitive Search, and Cosmos DB, with attention to security, observability, and cost efficiency
Nice to Have
- Experience developing automated evaluation frameworks for LLMs and agent workflows, including the use of golden datasets, offline and online testing, and measurable quality indicators such as task success rate, groundedness, or human review alignment
- Practical experience with responsible AI, including adversarial testing for prompt injection and data exfiltration, safety assessments, and implementation of safeguards to minimize hallucinations and harmful outputs
- Proven ability to implement comprehensive observability for agentic systems, including distributed tracing, tool call success tracking, latency and error budgeting, and telemetry for tokens and costs with actionable alerts
- Experience designing secure architectures for tool-enabled agents, applying principles such as least privilege access, secrets management, and policy-driven controls for tool or API execution using mechanisms like OAuth scopes, managed identity, and audit logging
- Demonstrated success optimizing performance and cost efficiency in LLM or voice systems using techniques like caching, batching, streaming responses, rate limiting, model routing, and fallback mechanisms


