About the Role
The role involves building scalable systems for managing and improving labeled data used in AI training, with a focus on automation, tooling, and data integrity.
Responsibilities
- Develop and maintain high-throughput data labeling platforms
- Design automation workflows to reduce manual intervention in data processing
- Collaborate with machine learning teams to understand data requirements
- Improve accuracy and consistency of labeled datasets
- Build tools for data quality validation and anomaly detection
- Optimize data pipelines for speed and reliability
- Integrate feedback loops from model performance into labeling systems
- Work with large-scale sensor and simulation data
- Create dashboards and metrics for monitoring labeling operations
- Support active learning systems by prioritizing data for labeling
- Ensure data versioning and traceability across training cycles
- Develop APIs for labeling tools and data access
- Collaborate with data operations teams on workflow improvements
- Implement security and access controls for sensitive data
- Streamline data annotation processes with intelligent tooling
- Troubleshoot issues in data ingestion and processing
- Contribute to documentation for labeling systems
- Evaluate third-party labeling vendors and tools
- Scale infrastructure to meet growing data demands
- Apply software engineering best practices to data-centric workflows
Nice to Have
- Master’s degree in a technical field
- Experience with labeling platforms or annotation tools
- Knowledge of computer vision concepts
- Familiarity with robotic or simulation data
- Contributions to open-source data projects
- Experience in autonomous vehicle or robotics domains
- Background in human-in-the-loop systems
- Understanding of model-driven data prioritization
- Experience working with large-scale data labeling operations
- Prior success in improving labeling efficiency
Compensation
Competitive salary and equity package
Work Arrangement
Hybrid model with flexibility to work remotely or from the office
About the Team
Part of the core engineering organization, this team builds the foundational data systems that power AI model development, with a focus on data infrastructure, automation, scalability, and data quality.
Impact
Engineers directly influence the speed and accuracy of model training by improving how data is collected, labeled, and used.
Available for qualified candidates