Responsibilities
- Architect and deploy machine learning pipelines that use large models to produce accurate ground truth labels at scale
- Improve model performance through optimized prompting, few-shot learning, and fine-tuning techniques
- Develop systems to verify labels, assign confidence scores, and validate output quality
- Assess which annotation tasks can be automated and which require human input, and establish clear decision rules
- Build evaluation frameworks to compare automated outputs with human-annotated datasets
- Enhance annotation accuracy by incorporating feedback from human review processes
- Lead the full lifecycle of automated annotation systems, from design to production monitoring
- Make key technical choices regarding model selection, pipeline architecture, and validation methods
- Define how annotation systems interface with existing platform infrastructure and model serving layers
- Work with Data Operations to create efficient human-in-the-loop review processes
- Support strategic planning in collaboration with senior technical leaders
- Develop high-throughput inference pipelines capable of processing massive document volumes
- Implement monitoring and alerting systems to detect quality drops or system outages
- Optimize performance by designing batching, caching, and fallback strategies that balance cost, speed, and accuracy
- Partner with Platform teams to scale model serving, APIs, and underlying infrastructure
- Document annotation methodologies, performance metrics, and system limitations clearly and thoroughly
Benefits
- Comprehensive medical, accident, and life insurance coverage
- Weekly wellness programs focused on physical and mental health
- Generous paid time off policy