Remote (Global)

LILT is hiring an AI Benchmark Engineer - Native Language Specialist | Czech

Responsibilities

  • Assess the performance of AI coding agents in real-world scenarios
  • Create authentic task environments using datasets and files in Czech to test true multilingual capabilities
  • Identify cases where AI systems fail to understand or generate correct output in Czech
  • Help develop strong reference solutions and write precise, deterministic verification scripts, using rubric-based evaluation only when essential
  • Review execution logs and calibrate task difficulty, on a scale from Easy to Very Hard, using standard Terminal-Bench configurations across different model types
  • Take part in a four-stage human quality assurance process (task creation, human review, calibration review, and audit), combined with automated LLM checks, to maintain fairness, linguistic correctness, and benchmark reliability

Benefits

  • Receive competitive compensation
  • Enjoy engaging and meaningful work
  • Contribute to advancements in artificial intelligence and language technology
  • Connect with professionals in a collaborative, global community
  • Apply through a simplified process designed for skilled contributors

Compensation

Paid work

Work Arrangement

Remote

Team

Global, multilingual team

Target Languages

Spanish, German, Czech, Turkish, Arabic (Egyptian), Korean, Japanese, Hausa, Hindi, Marathi

Work Flexibility

Work on varied projects remotely, at times convenient for you

Payment Terms

Receive timely and equitable payments

Application Requirements

Submit your CV in English

Hiring Process

  • AI and automated tools may assist in screening résumés, scoring assessments, and analyzing interviews
  • Final hiring decisions are made by human reviewers
  • Candidates can choose to opt out of AI-assisted hiring by contacting recruiting@lilt.com
  • The company follows fair, inclusive, and transparent hiring practices

Equal Opportunity Employer

LILT does not discriminate based on race, religion, color, national origin, ancestry, sex, sexual orientation, gender identity, age, disability, medical condition, genetic characteristics, veteran status, marital status, pregnancy, or other protected categories

Required Skills

Python, Shell Scripting, Machine Learning, Natural Language Processing, AI Benchmarking, Data Analysis, Czech Language, English Language, Quality Evaluation, Statistical Analysis, Large Language Models, AI/ML Systems

About company

LILT

LILT builds multilingual AI and human-verified services that make the world's information available to everyone, regardless of language. The company serves Enterprises, Governments, and AI Developers worldwide.

Job Details

Category: Other
Posted: 3 months ago