About the Role
This position involves conducting research and engineering to improve vision capabilities in AI models, including image understanding and multimodal reasoning, with an emphasis on robustness and alignment.
Responsibilities
- Design and implement experiments to advance computer vision components in AI systems
- Collaborate with researchers to integrate visual understanding into multimodal models
- Develop datasets and evaluation frameworks for image-based tasks
- Optimize model performance on visual reasoning benchmarks
- Investigate how models interpret and generate visual content
- Contribute to scalable training pipelines involving image and text data
- Analyze failure modes in vision-language systems
- Support deployment of vision features in research prototypes
- Publish findings in academic or technical venues when appropriate
- Work closely with safety teams to evaluate visual model behavior
Nice to Have
- PhD in machine learning, computer vision, or related area
- Publications in top-tier vision or AI conferences
- Hands-on experience with transformer-based vision models
- Contributions to open-source vision projects
- Experience with 3D vision or video understanding
- Background in model safety or alignment research
- Familiarity with adversarial robustness in vision systems
- Track record of shipping research into production systems
Compensation
Competitive salary based on experience and location
Work Arrangement
Hybrid or remote options available depending on team and location
Team
Part of the core research team focused on advancing multimodal AI systems
Research Focus
Work will center on improving how AI systems process and reason about visual inputs, including still images and diagrams, in coordination with language understanding.
Impact
Research outcomes may inform safer and more reliable AI systems capable of accurately interpreting visual information in real-world contexts.
Visa Sponsorship
Available for qualified candidates requiring sponsorship