About the Role

The role involves building scalable systems for managing and improving labeled data used in AI training, with a focus on automation, tooling, and data integrity.

Responsibilities

Develop and maintain high-throughput data labeling platforms
Design automation workflows to reduce manual intervention in data processing
Collaborate with machine learning teams to understand data requirements
Improve accuracy and consistency of labeled datasets
Build tools for data quality validation and anomaly detection
Optimize data pipelines for speed and reliability
Integrate feedback loops from model performance into labeling systems
Work with large-scale sensor and simulation data
Create dashboards and metrics for monitoring labeling operations
Support active learning systems by prioritizing data for labeling
Ensure data versioning and traceability across training cycles
Develop APIs for labeling tools and data access
Collaborate with data operations teams on workflow improvements
Implement security and access controls for sensitive data
Streamline data annotation processes with intelligent tooling
Troubleshoot issues in data ingestion and processing
Contribute to documentation for labeling systems
Evaluate third-party labeling vendors and tools
Scale infrastructure to meet growing data demands
Apply software engineering best practices to data-centric workflows

Nice to Have

Master’s degree in a technical field
Experience with labeling platforms or annotation tools
Knowledge of computer vision concepts
Familiarity with robotic or simulation data
Contributions to open-source data projects
Experience in autonomous vehicle or robotics domains
Background in human-in-the-loop systems
Understanding of model-driven data prioritization
Experience working with large-scale data labeling operations
Prior success in improving labeling efficiency

Compensation

Competitive salary and equity package

Work Arrangement

Hybrid model with flexibility to work remotely or from the office

Team

Part of the core engineering team focused on data infrastructure and automation systems

About the Team

This team builds the foundational data systems that power AI model development, focusing on automation, scalability, and data quality.

Impact

Engineers directly influence the speed and accuracy of model training by improving how data is collected, labeled, and used.

Available for qualified candidates

Waabi is hiring a Software Engineer, Labelling, Data & Automation