Protege is hiring a Senior Data Scientist

Responsibilities

Create and implement statistical and machine learning techniques to process and enhance massive unstructured datasets
Build systems to measure data variety, redundancy, and informational value
Develop statistical methodologies to reduce risks in training data composition
Work closely with modeling teams to detect data limitations and improve dataset effectiveness
Draw on experience collaborating across large foundation model projects and early-stage startups
Lead initiatives in data quality planning and define internal standards for best practices
Assess third-party datasets for potential integration, prioritizing scalability, accuracy, and impact on model outcomes
Support the creation of data scorecards to track quality and performance metrics
Assist in researching and developing automated tools for data preprocessing and validation

Required Skills

Machine Learning
Statistical Modeling
Python
SQL
Data Visualization
A/B Testing
Experimental Design
Communication
Project Management
Stakeholder Management

About company

Protege solves the biggest unmet need in AI — getting access to the right training data. The Protege platform facilitates the secure, efficient, and privacy-centric exchange of AI training data.

Job Details

Category: data

Posted 4 months ago