Jobgether is hiring a Senior MLOps Engineer to lead the design, development, and scaling of robust machine learning infrastructure that powers high-impact AI systems. You will bridge the gap between research and production, ensuring models are deployed efficiently, reliably, and at scale.
What You'll Do
- Design, build, and maintain end-to-end ML pipelines for model training, evaluation, and deployment.
- Develop and optimize infrastructure for distributed training and model serving across GPU and cloud environments.
- Implement CI/CD workflows for ML systems, including automated testing, deployment, and retraining.
- Monitor and manage model performance, drift, and data quality in production.
- Collaborate with AI researchers and engineers to productionize prototypes, ensuring reproducibility and scalability.
- Drive cost optimization and performance tuning for large-scale model training and inference.
- Contribute to internal documentation and establish best practices for MLOps processes.
What We're Looking For
- 6–10+ years of experience in software or ML engineering, including at least 3 years focused on MLOps or ML infrastructure.
- Strong programming skills in Python, C/C++, and Bash.
- Proven experience deploying and managing ML models in production environments.
- Expertise with Docker, Kubernetes, and cloud platforms (AWS, GCP, Azure) for scalable ML systems.
- Hands-on experience with CI/CD pipelines (GitHub Actions, Jenkins, or similar).
- Familiarity with ML experiment tracking and workflow tools (MLflow, Weights & Biases, Kubeflow).
- Understanding of model versioning, reproducibility, and monitoring strategies.
- Excellent problem-solving, communication, and collaboration skills.
Nice to Have
- Experience training models from scratch and optimizing them (quantization, distillation).
- Experience with infrastructure-as-code (Terraform, CloudFormation).
- Experience with distributed systems.
- Contributions to open-source projects.
Technical Stack
- Languages: Python, C/C++, Bash
- Containers & Orchestration: Docker, Kubernetes
- Cloud Platforms: AWS, GCP, Azure
- CI/CD: GitHub Actions, Jenkins
- ML Tools: MLflow, Weights & Biases, Kubeflow
- Infrastructure-as-Code: Terraform, CloudFormation
Benefits & Compensation
- Competitive Salary: $190,000–$240,000 depending on skills and experience.
- Generous PTO & Flexible Time Off.
- Comprehensive Healthcare Benefits for you and your family.
- Professional Growth: Opportunities to work on advanced ML systems and expand technical expertise.
Work Mode
This is a remote position open to candidates within the United States. It offers a flexible work environment with occasional travel for team meetings.
Jobgether is an equal opportunity employer.




