US - Remote | Remote (Country) | Full-time

Sully.ai is hiring an Applied Research Scientist

Responsibilities

  • Build and scale automated evaluation pipelines (LLM-as-judge + human review) with clinical-grade benchmarks.

Requirements

  • Proven experience designing agentic processes and LLM evaluation/benchmarking frameworks.
  • Strong Python and ML background (PyTorch/TensorFlow, Hugging Face, LangChain/LlamaIndex).
  • Demonstrated ability to design rigorous experiments and translate findings into production.
  • Track record of published research or deep applied work in LLMs and agent evaluation.
  • Strong communication and technical writing skills to articulate complex findings clearly.
About company
Sully.ai
Sully.ai is transforming healthcare access by integrating AI into medical workflows and automating healthcare administrative tasks throughout the patient visit cycle, with the aim of enhancing efficiency, reducing errors, and supporting real-time decision-making.
Job Details
Department: Engineering
Category: Other
Posted: 4 months ago