Mindrift is looking for a hands-on MCP & Tools Python Developer to join a project focused on building infrastructure to evaluate AI agent behavior. You’ll develop Model Context Protocol (MCP) servers and internal tools to run and assess agent actions.
What You'll Do
- Develop and maintain MCP-compatible evaluation servers
- Implement logic to check agent actions against scenario definitions
- Create or extend tools that writers and QAs use to test agents
- Work closely with infrastructure engineers to ensure compatibility
- Occasionally help with test writing or debug sessions when needed
What We're Looking For
- 4+ years of Python development experience, ideally in backend or tooling roles
- Solid experience building APIs, testing frameworks, or protocol-based interfaces
- Understanding of Docker, the Linux CLI, and HTTP-based communication
- Ability to integrate new tools into existing infrastructure
- Familiarity with how LLM agents are prompted, executed, and evaluated
- Clear documentation and communication skills, since you’ll work with QA and writers
Nice to Have
- Experience with Model Context Protocol (MCP) or similar structured agent-server interfaces
- Knowledge of FastAPI or similar async web frameworks
- Experience working with LLM logs, scoring functions, or sandbox environments
- Ability to support dev environments (devcontainers, CI configs, linters)
- JavaScript (JS) experience
Technical Stack
- Python
- Docker
- Linux CLI
- HTTP
- FastAPI
- JS
Team & Environment
You will work closely with infrastructure engineers, QA, and writers.
Benefits & Compensation
- Get paid for your expertise, with rates that can go up to $17/hour depending on your skills, experience, and project needs
- Take part in a flexible, remote, freelance project that fits around your primary professional or academic commitments
- Participate in an advanced AI project and gain valuable experience to enhance your portfolio
- Influence how future AI models understand and communicate in your field of expertise
Work Mode
This is a fully remote position.
We believe in using the power of collective human intelligence to ethically shape the future of AI.


