Lead AI Engineer (Generative AI & LLMOps)
Role Overview
Take ownership of the core AI architecture for a next-generation project, shaping how generative intelligence is designed, deployed, and maintained. This role demands a technical leader who can bridge advanced AI concepts with real-world engineering constraints, ensuring systems are performant, reliable, and aligned with product goals.
Key Responsibilities
- Design and build the foundational AI components powering intelligent features
- Develop and refine Retrieval-Augmented Generation pipelines, including data indexing, retrieval strategies, and response generation
- Choose and integrate large language models based on performance, cost, and operational needs
- Ensure AI-generated content is factually grounded, secure, and consistent with quality standards
- Collaborate closely with technical leadership to embed AI services within the broader software ecosystem
- Guide engineers in AI best practices, from prompt design to system observability
Required Qualifications
- 8+ years in software engineering, with at least 2 years focused on building and shipping GenAI applications
- Extensive experience with LLM orchestration tools such as LangChain, LlamaIndex, or Haystack
- Proven track record implementing RAG systems, including embedding pipelines and vector database optimization
- Advanced skills in prompt engineering, including structured prompting and hallucination reduction techniques
- Familiarity with trade-offs between proprietary and open-source LLMs, and experience hosting models via Hugging Face or vLLM
- Strong Python proficiency, particularly in asynchronous and data-intensive environments
- Experience setting up evaluation frameworks to monitor accuracy, latency, and cost (e.g., RAGAS, TruLens, LangSmith)
- Ability to design resilient backend APIs that handle streaming, errors, and non-deterministic model behavior
- Fluency in English (C1 or higher), with the ability to clearly communicate technical AI concepts to non-technical audiences
Preferred Qualifications
- Hands-on experience fine-tuning models using PEFT, LoRA, or QLoRA for domain-specific tasks
- Knowledge of LLMOps pipelines and deployment automation using BentoML, Modal, or AWS SageMaker
- Understanding of AI security risks such as prompt injection and data leakage, and mitigation strategies
- Experience with multi-modal models involving vision, audio, or speech processing
- Product-oriented mindset with attention to user experience in AI-driven workflows
Technical Environment
LangChain, LlamaIndex, Haystack, Pinecone, pgvector, RAGAS, TruLens, LangSmith, FastAPI, Flask, Hugging Face, vLLM, OpenAI, Anthropic, Gemini, Llama 3, Mistral, BentoML, Modal, AWS SageMaker
