Poland Remote (Global) Employment

Mindrift is hiring an Evaluation Scenario Writer - AI Agent Testing Specialist

About the Role

Mindrift is looking for an Evaluation Scenario Writer - AI Agent Testing Specialist to design structured test scenarios that evaluate the performance of LLM-based agents. You will create realistic simulations of human-performed tasks and define the gold-standard behavior against which agent actions are measured.

What You'll Do

  • Design structured test scenarios based on real-world tasks.
  • Define the golden path and acceptable agent behavior.
  • Annotate task steps, expected outputs, and edge cases.
  • Work with developers to test your scenarios and improve clarity.
  • Review agent outputs and adapt tests accordingly.

What We're Looking For

  • Bachelor's or Master's degree in Computer Science, Software Engineering, Data Science / Data Analytics, Artificial Intelligence / Machine Learning, Computational Linguistics / Natural Language Processing (NLP), Information Systems, or a related field.
  • 3+ years of relevant experience.
  • Advanced (C1) or above level of English proficiency.
  • Willingness to learn new methods and to switch quickly between tasks and topics.
  • Ability to work with challenging, complex guidelines when required.
  • A laptop, a reliable internet connection, and available time.

Benefits & Compensation

  • Take part in a part-time, remote, freelance project that fits around your primary professional or academic commitments.
  • Work on advanced AI projects and gain valuable experience that enhances your portfolio.
  • Influence how future AI models understand and communicate in your field of expertise.

Work Mode

This is a global, remote opportunity.

Required Skills

  • AI Agent Testing
  • Test Scenario Design
  • Prompt Engineering
  • Critical Thinking
  • Analytical Skills
  • Written Communication
  • Attention to Detail
  • LLM Evaluation
  • Creative Writing
  • Quality Assurance

About company
Mindrift

Mindrift connects specialists with AI projects from major tech innovators. Their mission is to unlock the potential of Generative AI by tapping into real-world expertise from across the globe.

Job Details
Category: QA Testing
Posted 7 months ago