Mindrift is hiring a Freelance AI Evaluation Scenario Writer to design structured, realistic test cases that simulate human-performed tasks for LLM-based agents. You will be responsible for defining gold-standard behavior and creating clear, reusable scenarios that help shape future AI models.
What You'll Do
- Design structured test scenarios based on real-world tasks.
- Define the golden path and acceptable agent behavior for comparison.
- Annotate task steps, expected outputs, and edge cases.
- Work with developers to test scenarios and improve their clarity.
- Review agent outputs and adapt tests accordingly.
What We're Looking For
- A bachelor's and/or master's degree in Computer Science, Software Engineering, Data Science, AI/ML, Computational Linguistics/NLP, Information Systems, or a related field.
- Background in QA, software testing, data analysis, or NLP annotation.
- A good understanding of test design principles like reproducibility, coverage, and edge cases.
- Strong written communication skills in English.
- Comfort with structured formats like JSON or YAML for scenario description.
- Ability to define expected agent behaviors and scoring logic.
- Basic experience with Python and JavaScript.
- Curiosity and openness to working with AI-generated content, agent logs, and prompt-based behavior.
- Readiness to learn new methods, switch between tasks quickly, and sometimes work with challenging guidelines.
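To give a sense of what a scenario in a structured format with expected behavior and scoring logic could look like, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than Mindrift's actual schema or tooling: the field names (`id`, `task`, `golden_path`, `edge_cases`), the `score_run` helper, and the scoring rule (a simple greedy in-order match against the golden path) are all hypothetical.

```python
import json

# Hypothetical scenario description; the field names are assumptions
# chosen for illustration, not a real project schema.
SCENARIO_JSON = """
{
  "id": "refund-request-001",
  "task": "Process a customer refund request",
  "golden_path": [
    "look_up_order",
    "verify_eligibility",
    "issue_refund",
    "notify_customer"
  ],
  "edge_cases": ["order not found", "refund window expired"]
}
"""


def score_run(golden_path, agent_steps):
    """Return the fraction of golden-path steps the agent performed in order.

    Uses a greedy left-to-right match: each golden step must appear in the
    agent's log after the previously matched step.
    """
    matched = 0
    position = 0
    for step in golden_path:
        try:
            found = agent_steps.index(step, position)
        except ValueError:
            continue  # step missing (or out of order); no credit for it
        matched += 1
        position = found + 1
    return matched / len(golden_path) if golden_path else 0.0


scenario = json.loads(SCENARIO_JSON)

# Hypothetical agent log in which the eligibility check was skipped.
agent_steps = ["look_up_order", "issue_refund", "notify_customer"]
print(score_run(scenario["golden_path"], agent_steps))  # 0.75
```

In this sketch the agent earns credit for three of the four golden-path steps, so the skipped eligibility check shows up directly in the score. A real project would likely define its own format and stricter scoring rules.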
Nice to Have
- Experience in writing manual or automated test cases.
- Familiarity with LLM capabilities and typical failure modes.
- Understanding of scoring metrics like precision, recall, coverage, and reward functions.
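As a rough illustration of two of these metrics, the sketch below computes precision and recall for the items an agent produced against the items a test case expects. The `precision_recall` helper and the sample field names are hypothetical and only show the arithmetic. Coverage and reward functions depend on project-specific definitions and are not shown.

```python
def precision_recall(expected, produced):
    """Compute precision and recall of produced items against expected items."""
    expected_set, produced_set = set(expected), set(produced)
    true_positives = len(expected_set & produced_set)
    # Precision: share of produced items that were actually expected.
    precision = true_positives / len(produced_set) if produced_set else 0.0
    # Recall: share of expected items that the agent actually produced.
    recall = true_positives / len(expected_set) if expected_set else 0.0
    return precision, recall


# Hypothetical example: fields an agent was expected to extract
# versus the fields it actually returned.
expected = ["order_id", "refund_amount", "customer_email", "reason"]
produced = ["order_id", "refund_amount", "shipping_address"]

precision, recall = precision_recall(expected, produced)
print(f"precision={precision:.2f} recall={recall:.2f}")  # precision=0.67 recall=0.50
```

Here two of the three produced fields are correct (precision of about 0.67), and two of the four expected fields were found (recall of 0.50).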
Technical Stack
- Python
- JavaScript
Benefits & Compensation
- Contribute on your own schedule, from anywhere in the world.
- Get paid for your expertise, with rates that can go up to $22/hour depending on skills, experience, and project needs.
- Take part in a flexible, remote, freelance project that fits around your primary professional or academic commitments.
- Participate in an advanced AI project and gain valuable experience to enhance your portfolio.
- Influence how future AI models understand and communicate in your field of expertise.
Work Mode
This is a freelance position that is fully remote, flexible, and open to contributors worldwide.
Mindrift believes in using the power of collective human intelligence to ethically shape the future of AI.




