Responsibilities
- Develop and enhance Document AI models to identify entities, relationships, and document structures within intricate legal and medical documents.
- Address modeling challenges posed by long context windows and reasoning across multiple documents, including facts dispersed across documents and document segmentation.
- Lead the fine-tuning of large language models, using methods such as reinforcement learning with auditable rewards and parameter-efficient approaches like LoRA and QLoRA.
- Define and implement strict evaluation protocols to minimize false outputs, increase factual reliability, and manage uncertain or low-quality inputs.
- Ensure data integrity through direct analysis, focusing on training and test dataset quality, edge case handling, noise reduction, and drift detection.
- Test and assess advanced prompting strategies such as few-shot and chain-of-thought prompting, balancing context usage against extraction precision.
- Offer technical guidance and mentorship to machine learning engineers and data scientists, promoting high engineering standards and professional development.
- Work across teams with product, engineering, and domain specialists in law and medicine to convert exploratory objectives into deployable systems.
- Serve as a link between emerging research and real-world implementation, integrating novel methods into production pipelines efficiently.
Work Arrangement
Hybrid — San Francisco, Toronto