Responsibilities
- Design and deploy production-grade Generative AI features using Large Language Models (LLMs), embedding models, and multimodal architectures, for example Llama, Mistral, Claude, the GPT series, and Stable Diffusion.
- Build and optimize Retrieval-Augmented Generation (RAG) pipelines, vector databases (Pinecone, Weaviate, Qdrant, Chroma, or PGVector), and prompt engineering and tool-calling workflows.
- Build and maintain Python-based backend applications using FastAPI, Flask, or Django, along with orchestration logic for AI agents and multi-phase processes.
- Fine-tune and align open-source LLMs using parameter-efficient methods such as LoRA and QLoRA and preference-optimization methods such as DPO, with frameworks including Hugging Face Transformers, PEFT, TRL, Unsloth, and Axolotl.
- Embed generative AI features into current software platforms, enabling chatbots, co-pilot tools, content creation, code assistance, and image or video generation functionalities.
