Freenome is seeking a Senior Machine Learning Research Engineer to join the Machine Learning Science (MLS) team within the Computational Science department. The role involves developing and deploying infrastructure to support deep learning models using large-scale genomic data, optimizing distributed training pipelines, and collaborating with interdisciplinary teams to advance early cancer detection through AI/ML innovation.
What You'll Do
- Implement and refine DL pipelines on distributed computing platforms, improving the speed and efficiency of DL operations, including model training, data handling, model management, and inference.
- Collaborate closely with ML scientists and software engineers to understand current challenges and requirements, and ensure that the DL model development pipelines you create align with scientific goals and operational needs.
- Continuously monitor, evaluate, and optimize DL model training pipelines for performance and scalability.
- Stay up to date with the latest advancements in AI, ML, and related technologies, and quickly learn and adopt new tools and frameworks as needed.
- Develop and maintain robust, reproducible DL pipelines that execute reliably and produce consistent, accurate results.
- Drive performance improvements across our stack through profiling, optimization, and benchmarking. Implement efficient caching solutions and debug distributed systems to accelerate both training and evaluation pipelines.
- Act as a bridge facilitating communication between the engineering and scientific teams, documenting and sharing best practices to foster a culture of learning and continuous improvement.
What We're Looking For
- MS or equivalent experience in a relevant quantitative field such as Computer Science, Statistics, Mathematics, or Software Engineering, with an emphasis on AI/ML theory and/or practical development.
- 5+ years of post-MS industry experience developing AI/ML software engineering pipelines.
- Proficiency in a general-purpose programming language such as Python (preferred), Java, Julia, C, or C++.
- Strong knowledge of ML and DL fundamentals and hands-on experience with machine learning frameworks such as PyTorch, TensorFlow, JAX, or scikit-learn.
- In-depth knowledge of scalable and distributed computing platforms that support complex model training (such as Ray or DeepSpeed) and their integration with ML developer tools like TensorBoard, Wandb, or MLflow.
- Experience with cloud platforms (e.g., AWS, Google Cloud, Azure), including deploying and managing AI/ML models and pipelines in a cloud environment.
- Understanding of containerization technologies (e.g., Docker) and computing resource orchestration tools (e.g., Kubernetes) for deploying scalable ML/AI solutions.
- Proven track record of developing and optimizing workflows for training DL models, large language models (LLMs), or similar models on problems with high data complexity and volume.
- Experience managing large datasets, including data storage (such as HDFS or Parquet on S3), retrieval, and efficient data processing techniques (via libraries and executors such as PyArrow and Spark).
- Proficiency in version control systems (e.g., Git) and continuous integration/continuous deployment (CI/CD) practices to maintain code quality and automate development workflows.
- Expertise in building and launching large-scale ML frameworks that support the needs of a research team in a scientific environment.
- Excellent ability to work effectively with cross-functional teams and communicate across disciplines.
Nice to Have
- Experience working with large-scale genomics or biological datasets.
- Experience managing multimodal datasets, such as combinations of sequence, text, image, and other data.
- Experience with GPU/accelerator programming and kernel development (such as CUDA, Triton, or XLA).
- Experience with infrastructure-as-code and configuration management.
- Experience cultivating MLOps and ML infrastructure best practices, especially around reliability, provisioning and monitoring.
- Strong track record of contributions to relevant DL projects, e.g., on GitHub.
Technical Stack
- Python, Java, Julia, C, C++, PyTorch, TensorFlow, JAX, scikit-learn, Ray, DeepSpeed, TensorBoard, Wandb, MLflow, AWS, Google Cloud, Azure, Docker, Kubernetes, HDFS, Parquet, S3, PyArrow, Spark, Git, CUDA, Triton, XLA
Team & Environment
- Interdisciplinary R&D team including machine learning scientists, computational biologists, and software engineers.
- Reports to the Director of Machine Learning Science
- Commitment to reducing cancer mortality via accessible early detection
- Interdisciplinary collaboration
- Culture of learning and continuous improvement
- Diversity and inclusion
- Equal opportunity employment
Benefits & Compensation
- Base salary range of $161,925 - $227,325
- Eligibility to receive equity
- Cash bonuses
- Full range of medical benefits
- Financial benefits
- Other benefits depending on the position offered
- Equal opportunity employer with commitment to diversity
- Compliance with Family & Medical Leave Act (FMLA)
- Equal Employment Opportunity (EEO)
- Employee Polygraph Protection Act (EPPA)
Work Mode
- Hybrid role with 2-3 days per week in office
- Remote option available
- Location: Brisbane, California
Freenome is proud to be an equal-opportunity employer, and we value diversity. Freenome does not discriminate on the basis of race, color, religion, marital status, age, national origin, ancestry, physical or mental disability, medical condition, pregnancy, genetic information, gender, sexual orientation, gender identity or expression, veteran status, or any other status protected under federal, state, or local law. Applicants have rights under Federal Employment Laws including FMLA, EEO, and EPPA.