Full-time

Protege is hiring a Senior Data Scientist

About the Role

Protege is looking for a Senior Data Scientist to be at the heart of how we curate, assess, and prepare the training data that powers real-world AI systems. You'll lead the evaluation and optimization of large-scale datasets used to train state-of-the-art AI models.

What You'll Do

  • Design and apply statistical and machine learning methods to curate, filter, and enrich large-scale unstructured datasets
  • Develop frameworks to assess data diversity, duplication, and informativeness
  • Design statistical approaches to de-risk training datasets
  • Collaborate with model training teams to identify data bottlenecks and optimize dataset performance
  • Provide leadership on data quality strategy and shape internal best practices
  • Evaluate external datasets for integration, focusing on scalability, quality, and relevance to model performance
  • Help build data scorecards
  • Contribute to research and development of tools that automate data preprocessing and validation

What We're Looking For

  • PhD, or a Master's degree with 4+ years of industry experience, in machine learning, economics, mathematics, engineering, computer science, statistics, or a related quantitative field
  • Strong understanding of AI model training pipelines, including pre-processing and evaluation
  • Experience working with large, unstructured datasets, especially text
  • Background in statistical analysis, bias detection, and data validation
  • Ability to identify high-impact problems and drive solutions independently

Nice to Have

  • Experience with synthetic data generation or augmentation strategies
  • Publications or open-source contributions in data-centric AI or related areas
  • Experience developing evaluation frameworks or performance metrics for training data
  • Experience collaborating cross-functionally with product, infrastructure, or partnership teams

Team & Environment

You will collaborate with research and engineering teams within our lean, fast-moving, high-trust environment. Our culture is built for people who thrive on ambiguity, own outcomes, and want to shape the future of data and AI.

Required Skills

  • Machine learning
  • Statistical modeling
  • Python
  • SQL
  • Data visualization
  • A/B testing
  • Experimental design
  • Communication
  • Project management
  • Stakeholder management
About the Company
Protege

Protege solves the biggest unmet need in AI — getting access to the right training data. The Protege platform facilitates the secure, efficient, and privacy-centric exchange of AI training data.

Job Details
Category: data
Posted: a month ago