About the Role
The engineer will bridge machine learning and infrastructure, enabling scalable model deployment, maintaining CI/CD pipelines, and improving system observability and performance.
Responsibilities
- Design and manage scalable infrastructure for machine learning models
- Implement and maintain CI/CD pipelines for ML systems
- Ensure reliable deployment, monitoring, and rollback of models
- Optimize training and inference pipelines for efficiency
- Collaborate with data scientists to productionize models
- Maintain system observability using logging and metrics tools
- Automate operational workflows to reduce manual intervention
- Support model versioning and experiment tracking
- Troubleshoot performance issues in distributed systems
- Enforce security and compliance standards in ML infrastructure
- Improve reliability and uptime of ML services
- Integrate models into real-time data pipelines
- Manage dependencies and environments across development stages
- Monitor resource utilization and scale infrastructure as needed
- Document system architecture and operational procedures
Nice to Have
- Experience with large-scale recommendation systems
- Background in real-time bidding or ad tech platforms
- Knowledge of feature store implementations
- Familiarity with model monitoring and drift detection
- Experience with serverless ML deployments
Benefits
- Flexible working hours
- Remote work options
- Health insurance coverage
- Professional development budget
- Paid time off and holidays
- Team retreats and company events
- Access to learning resources
- Stock options or equity participation
- Modern development tools and equipment
- Supportive and inclusive work culture
Compensation
Competitive salary based on experience and location
Work Arrangement
Remote
Team
Cross-functional team focused on data science and platform reliability
This position plays a key role in maintaining the reliability and scalability of machine learning services. The engineer will work closely with research and engineering teams to transition models from experimentation to production.
Technology Stack
The team uses Kubernetes for orchestration, Docker for containerization, Terraform for infrastructure provisioning, and a mix of open-source and proprietary tools for model monitoring and deployment.