Responsibilities
- Architect and build features for AI agents, expanding their functionality and improving their performance
- Optimize context management in agents using methods such as sub-agents, retrieval-augmented techniques, and sliding windows
- Integrate language models from leading providers, including OpenAI, Anthropic, and Google
- Assess and select optimal models for specific tasks using quantitative evaluation
- Evaluate new model versions and features through early access testing and performance analysis
- Enable secure and efficient use of external tools and APIs by AI agents
- Design structured action frameworks for operations like web searches, database access, and information retrieval
- Use development tools such as Vercel’s AI SDK and LangGraph to create multi-step AI workflows
- Collaborate with your immediate team and cross-functional teams to deliver AI-driven functionality
- Work with engineers and product teams to ensure features are robust, scalable, and production-ready
- Mentor and support mid-level and junior engineers to strengthen team capabilities
- Gather and organize response data from multi-turn interactions to analyze agent behavior
- Study patterns in conversations, errors, and successes to inform system improvements
- Keep current with advancements in natural language processing and large language models
- Test new approaches, including prompting techniques, context strategies, and fine-tuning options
- Monitor system performance through structured testing and user feedback loops
- Refine prompts, agents, and workflows to enhance accuracy and consistency
- Create automated benchmarks to evaluate large language model outputs systematically
Work Arrangement
Remote (Worldwide)