Responsibilities
- Architect and Optimize Systems: Design and oversee the development of scalable data pipelines for complex model training and real-time inference.
- Advanced LLM Development: Lead the fine-tuning, evaluation, and optimization of Large Language Models (LLMs) specifically for production-level Agentic Digital Assistants.
- Production & Infrastructure Leadership: Direct the deployment of open-source and proprietary models on remote servers, ensuring high performance, low latency, and cost-efficiency.
- Strategic Integration: Work closely with cross-functional engineering leads to integrate sophisticated ML components into broader system architectures.
- Model Governance: Establish robust monitoring frameworks to track model performance and implement automated retraining loops to maintain quality and relevance.
- R&D Mentorship: Stay at the forefront of AI research and tools, translating new techniques into actionable strategies for the team.
Requirements
- Deep LLM Expertise: Extensive experience with transformers and advanced techniques in fine-tuning, prompt engineering, and rigorous model evaluation.
- Senior Production Track Record: A proven history of taking complex ML projects from research notebooks to successful, large-scale production environments.
- Expert Programming & Framework Knowledge: Mastery of Python and deep learning frameworks.
- MLOps Mastery: Deep familiarity with professional MLOps tooling (e.g., MLflow, Weights & Biases, Docker) and cloud-native architectures on OCI, AWS, or GCP.
- Strategic Builder Mentality: A drive to ship fast and iterate based on user data, while maintaining a long-term technical vision for product growth.
- Collaborative Leadership: Strong communication skills with the ability to lead remote-first teams and foster a culture of technical excellence and inclusion.