Poland Remote (Global) Employment

Mindrift is hiring an Evaluation Scenario Writer - AI Agent Testing Specialist

About the Role

Mindrift is looking for an Evaluation Scenario Writer - AI Agent Testing Specialist to design structured test scenarios that evaluate the performance of LLM-based agents. You will create realistic simulations of tasks normally performed by humans and define the gold-standard behavior against which agent actions are measured.

What You'll Do

  • Design structured test scenarios based on real-world tasks.
  • Define the golden path and acceptable agent behavior.
  • Annotate task steps, expected outputs, and edge cases.
  • Work with developers to test your scenarios and improve clarity.
  • Review agent outputs and adapt tests accordingly.

What We're Looking For

  • Bachelor's or Master's degree in Computer Science, Software Engineering, Data Science / Data Analytics, Artificial Intelligence / Machine Learning, Computational Linguistics / Natural Language Processing (NLP), Information Systems, or a related field.
  • 3+ years of relevant experience.
  • English proficiency at Advanced (C1) level or above.
  • Ready to learn new methods and able to switch quickly between tasks and topics.
  • Able to work with challenging, complex guidelines when required.
  • A laptop, a reliable internet connection, and available time.

Benefits & Compensation

  • Take part in a part-time, remote, freelance project that fits around your primary professional or academic commitments.
  • Work on advanced AI projects and gain valuable experience that enhances your portfolio.
  • Influence how future AI models understand and communicate in your field of expertise.

Work Mode

This is a global, remote opportunity.

Required Skills

  • AI Agent Testing
  • Test Scenario Design
  • Prompt Engineering
  • Critical Thinking
  • Analytical Skills
  • Written Communication
  • Attention to Detail
  • LLM Evaluation
  • Creative Writing
  • Quality Assurance

About the Company
Mindrift

Mindrift connects specialists with AI projects from major tech innovators. Their mission is to unlock the potential of Generative AI by tapping into real-world expertise from across the globe.

Job Details
Category qa_testing
Posted 7 months ago