Quora is looking for a Senior Software Engineer to work on the company-wide Machine Learning Platform. You will be part of a small team solving technical problems at the intersection of Machine Learning, Distributed Systems, and High Performance Computing. This remote role focuses primarily on ML infrastructure (80%), with some involvement in model deployment (20%).
What You'll Do
- Design, develop, and maintain the core infrastructure that powers Quora's machine learning platform, ensuring high availability, scalability, and performance.
- Build scalable and reliable distributed systems for serving machine learning models.
- Optimize infrastructure performance across the ML platform, identifying and resolving bottlenecks to meet the demands of large-scale machine learning workloads.
- Collaborate with machine learning engineers to understand their infrastructure needs and provide solutions that enable them to build and deploy models efficiently.
- Contribute to the design and implementation of our next-generation machine learning infrastructure, focusing on scalability, reliability, and cost-effectiveness.
- Develop services on top of open source technologies like Kubernetes, TensorFlow, and PyTorch.
- Own business-critical infrastructure, help resolve production issues, and participate in the team-wide on-call rotation.
- Collaborate with ML engineers who use the platform, and help them be more impactful.
What We're Looking For
- Availability for meetings and impromptu communication during Quora's 'coordination hours' (Mon-Fri: 9am-3pm Pacific Time).
- Experience with building and owning end-to-end machine learning or data science-related systems.
- Experience instrumenting ML workloads for performance monitoring and efficiency.
- Experience with high-performance, large-scale distributed systems.
- 4+ years of industry experience in Machine Learning, Infrastructure, or related fields.
- 4+ years of experience writing production code in Python, C++, or a similar language.
- BS or MS in Computer Science, Engineering, or a related technical field.
Nice to Have
- Strong communication and interpersonal skills; experience working with ML teams is a plus.
- Experience working with Kubernetes, Docker, Terraform, or other forms of containerized infrastructure.
- Hands-on experience with AWS technologies such as EC2, EBS, S3, and EKS.
Technical Stack
- Kubernetes
- TensorFlow
- PyTorch
- Docker
- Terraform
- AWS (EC2, EBS, S3, EKS)
Team & Environment
You will be part of a small team responsible for Quora's Machine Learning Platform.
Benefits & Compensation
- Medical, dental, and vision coverage
- Equity refreshers
- Remote work reimbursement
- Paid time off
- Employee assistance programs
- Compensation varies by location:
  - US: $155,656 - $225,160 USD
  - Canada (Toronto/Vancouver): $199,399 - $230,748 CAD
  - Canada (other): $186,105 - $215,365 CAD
- Equity component included
Work Mode
This role is remote and open to candidates in multiple countries around the world.
We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.



