The Senior Machine Learning Scientist will join a global team of scientists and engineers at Turnitin to develop and deploy Machine Learning systems across learning, teaching, and academic integrity products. This role involves researching, developing, and productionizing novel ML models with global scale and impact, working closely with cross-functional teams to integrate AI into core platforms.
What You'll Do
- Research and develop production-grade Machine Learning models
- Optimize models for scaled production usage
- Work with colleagues in the AI team, other Engineering teams, subject matter experts, Product Management, Marketing, Sales, and Customer Support to explore product issues, challenges, and opportunities
- Recommend innovative ML/AI-based solutions
- Help out with ad hoc, one-off tasks as a team player within the AI team
- Work with subject matter experts to curate and generate optimal datasets following responsible data collection and model maintenance practices
- Explore and access SQL, NoSQL, and web data, and write efficient parallel pipelines
- Review and design datasets to ensure data quality
- Investigate weaknesses of models in production and work on pragmatic solutions
- Utilize, adopt, and fine-tune off-the-shelf models, including LLMs exposed via API (through prompt engineering and agents) and locally hosted LMs and other foundation models
- Stay current in the field: read research papers, experiment with new architectures and LLMs, and share your findings
- Write clean, efficient, and modular code with automated tests and appropriate documentation
- Stay up to date with technology and platforms, make good technological choices, and be able to explain them to the organization
- Work with downstream teams to productionize your work and ensure that it makes it into a product release
- Communicate insights, as well as the behavior and limitations of models, to peers, subject matter experts, and product owners
- Present and publish your work
What We're Looking For
- Well-balanced set of skills in both the Science and Software Engineering aspects of (Deep) Machine Learning
- Ability to construct novel model architectures, loss functions, training methods, and training loops
- Proficiency in the mathematics of machine learning and deep neural networks
- Ability to keep abreast of the latest research advancements in AI and Deep Learning across modalities
- Experience writing custom training loops
- Production-level coding and software engineering proficiency
- Ability to train large models (up to hundreds of billions of parameters)
- Ability to train on multiple GPUs and nodes
- Knowledge of the latest model training and inference advancements
- Sufficiently deep Computer Science background to deliver models with high accuracy and low compute cost
- Ability to write parallel and efficient pipelines for large datasets (billions of samples)
- Experience with dataset exploration, synthetic generation, design, construction, and analysis
Nice to Have
- Experience presenting work within the company
- Experience publishing work in peer-reviewed venues (preferably A/A+ rated)
Technical Stack
- Deep Learning
- Machine Learning
- AI
- LLMs
- Prompt Engineering
- Agents
- Foundation Models
- SQL
- NoSQL
- Web Data
- Parallel Pipelines
- GPU Training
- Multi-node Training
- Model Deployment
- Automated Testing
- Research Paper Reading
Team & Environment
- Global team
- Cross-functional team of scientists and engineers
Our culture is curious, helpful, and independent, with a commitment to deliver cutting-edge, well-engineered Machine Learning systems.
Work Mode
Remote work within the USA
