Gather AI seeks a Senior ML Engineer (Ops) to own the infrastructure engine behind our machine learning platform. This is a hands-on, high-ownership role focused on the "last mile" problem: ensuring sophisticated vision models run reliably at scale in production. You will be the primary builder and maintainer of our MLOps platform, leading the transition from manual deployments to a fully automated, enterprise-grade system.
What You'll Do
- Migrate box and barcode detection pipelines to cloud infrastructure following MLOps best practices.
- Build and maintain CI/CD pipelines for deployment across production and non-production environments.
- Implement automated rollback, canary, and blue-green deployment strategies for ML microservices.
- Build out a multi-tenant MLOps platform using tools like Prefect, ZenML, or similar orchestration frameworks.
- Establish a centralized model registry and versioning system for all production assets.
- Instrument observability across the ML stack — logging, metrics, and distributed tracing — to ensure reliability at scale.
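To give a flavor of the automated-rollback and canary work described above, here is a minimal sketch in plain Python of error-rate-based canary routing. All names, thresholds, and logic are illustrative assumptions for this posting, not Gather AI's actual system:

```python
import random


class CanaryRouter:
    """Illustrative sketch: send a fraction of traffic to a canary model
    and roll back automatically if its error rate exceeds a threshold.
    Names and thresholds are hypothetical, not a real production system."""

    def __init__(self, canary_fraction=0.1, error_threshold=0.05, min_requests=100):
        self.canary_fraction = canary_fraction    # share of traffic sent to the canary
        self.error_threshold = error_threshold    # max tolerated canary error rate
        self.min_requests = min_requests          # sample size before judging the canary
        self.canary_requests = 0
        self.canary_errors = 0
        self.rolled_back = False

    def route(self):
        """Return 'canary' or 'stable' for the next incoming request."""
        if self.rolled_back:
            return "stable"
        return "canary" if random.random() < self.canary_fraction else "stable"

    def record(self, target, ok):
        """Record a request outcome; trigger rollback if the canary is unhealthy."""
        if target != "canary" or self.rolled_back:
            return
        self.canary_requests += 1
        if not ok:
            self.canary_errors += 1
        if (self.canary_requests >= self.min_requests
                and self.canary_errors / self.canary_requests > self.error_threshold):
            self.rolled_back = True  # stop routing traffic to the canary
```

In production this decision loop would typically live in the service mesh or deployment controller (e.g., driven by metrics from the observability stack) rather than in application code; the sketch only shows the shape of the safeguard.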
What We're Looking For
- 6+ years of industry experience (outside academia) in ML engineering, MLOps, or infrastructure engineering.
- Deep operational fluency with Kubernetes and Docker for ML workload orchestration.
- Strong production-grade Python skills with a track record of hardening research code into scalable microservices.
- Hands-on experience with CI/CD for ML (e.g., GitHub Actions, GitLab CI) and model serving frameworks (e.g., KServe, SageMaker, Vertex AI Endpoints).
- Experience with pipeline orchestration and model lifecycle tools such as Airflow, MLflow, Kubeflow, or Flyte.
- Proven ownership of production system reliability, including SRE principles, observability stacks, and automated failure safeguards.
Nice to Have
- Prior experience building end-to-end MLOps pipelines (data, model, and inference) from scratch.
- Domain experience in logistics, supply chain, or robotics-adjacent cloud platforms.
- Familiarity with feature stores and training/serving data consistency patterns.
- Experience with Infrastructure as Code tools such as Terraform.
Technical Stack
- Orchestration: Kubernetes, Docker
- Core Language: Python
- CI/CD: GitHub Actions, GitLab CI
- Model Serving: KServe, SageMaker, Vertex AI Endpoints
- Pipeline Orchestration: Airflow, MLflow, Kubeflow, Flyte
- MLOps Platforms: Prefect, ZenML
- Infrastructure as Code: Terraform
Team & Environment
Our Engineering team builds the systems that turn cutting-edge ML research into reliable, production-grade infrastructure. We operate at the intersection of machine learning, cloud infrastructure, and real-world logistics. We value operational excellence, first-principles thinking, and the kind of engineering that makes complex systems look effortless.