Responsibilities
- Define AI system architecture aligned to business objectives and long-term scalability
- Translate ambiguous AI opportunities into structured technical approaches
- Design modular, extensible AI platforms supporting experimentation and production workloads
- Establish architecture decision records (ADRs) and technical documentation standards
- Lead design and implementation of ML systems, model services, and inference pipelines
- Build scalable training, evaluation, and deployment workflows
- Architect data ingestion, feature engineering, and model serving strategies
- Ensure the reliability, observability, and performance of AI systems
- Design and implement end-to-end AI pipelines and agentic orchestration frameworks
- Establish LLMOps best practices including CI/CD, prompt versioning, and real-time model observability
- Optimize the performance of distributed inference systems and retrieval-augmented generation (RAG) pipelines
- Ensure AI infrastructure meets security and compliance standards while remaining cost-efficient in GPU and token usage
- Bridge research and engineering by operationalizing prototypes into production-ready systems
- Define evaluation metrics, validation processes, and rollout strategies
- Improve iteration velocity through tooling and automation
- Mentor junior and mid-level AI engineers
- Lead code reviews and enforce engineering standards
- Guide teams in making informed tradeoffs across performance, scalability, cost, and complexity
- Contribute to shared AI best practices and internal accelerators
- Partner with product, engineering, and executive stakeholders to shape AI roadmaps
- Communicate technical strategy and system tradeoffs clearly to non-technical audiences
- Align AI initiatives with business value and measurable outcomes