OpenAI is hiring a Research Engineer/Scientist focused on RLHF and post-training for personalized, multimodal AI systems. You'll join a team building the learning and evaluation foundations that make models more context-aware, adaptive, and useful over time.
What You'll Do
- Develop RLHF and post-training methods for multimodal models.
- Build reward models and preference-learning pipelines for adaptive, personalized model behavior.
- Design datasets, rubrics, and evaluation frameworks that capture user preferences, contextual appropriateness, and long-term value in realistic tasks.
- Run experiments on policy improvement using explicit feedback, implicit signals, and model-based grading.
- Work on long-horizon evaluation problems, where model quality depends on whether behavior improves outcomes over time.
- Collaborate closely with safety researchers to ensure adaptation and personalization remain aligned, interpretable, and bounded by clear constraints.
- Prototype and iterate quickly on training recipes, reward formulations, data pipelines, and evaluation suites for product-relevant behaviors.
- Help define how OpenAI measures success for personalized AI systems, including trust, appropriateness, and long-term user benefit.
What We're Looking For
- Strong background in machine learning research, with experience in RLHF, reward modeling, preference optimization, or post-training for large models.
Nice to Have
- Experience in one or more of: reinforcement learning, ranking, recommender systems, personalization, memory, or human-in-the-loop evaluation.
- Experience building datasets or eval pipelines grounded in human preferences, rubrics, or real-world product behavior.
- Interest in multimodal AI and in how models can learn from richer interaction signals over time.
- Desire to work on product-shaping research with high stakes for trust, alignment, and long-term user value.
- Enjoyment of close collaboration with engineers, designers, and safety researchers to turn frontier research into real systems.
Team & Environment
You'll join an applied research team within the Consumer Devices group.
Work Mode
This is a hybrid role based in San Francisco, CA.
OpenAI is an equal opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or any other applicable legally protected characteristic.


