EMEA Remote (Global) Employment

Pragmatike is hiring an ML Ops Engineer (EMEA Remote)

About the Role

As an ML Ops Engineer, you will build and operate reliable, efficient ML inference systems for a fast-growing distributed cloud infrastructure startup. The role focuses on model serving, GPU-based workloads, and decentralized AI/ML architectures in a remote-first EMEA setup.

Responsibilities

  • Develop and manage production-level model serving infrastructure using tools like vLLM, TGI, Triton, or similar frameworks
  • Create deployment pipelines with blue/green and canary release strategies tailored for ML models
  • Build and sustain auto-scaling mechanisms, multi-model serving setups, and intelligent routing for inference requests
  • Improve efficiency in GPU usage, memory management, network performance, and model storage systems
  • Design monitoring solutions to track inference latency, throughput, GPU utilization, cost, and system health
  • Maintain model registries and implement CI/CD pipelines for automated, reproducible model deployments
  • Own the end-to-end ML system lifecycle, from development through production deployment and on-call support
  • Establish engineering standards and support platform scalability within a fast-paced startup environment

Requirements

  • Minimum of 4 years in ML Ops, Platform Engineering, SRE, or related infrastructure roles with focus on ML systems
  • Direct experience with model serving technologies such as vLLM, TGI, Triton, or comparable tools
  • Solid background in managing containerized GPU workloads in production using orchestration platforms
  • Proven experience with MLOps tooling including model registries, experiment tracking, and automated deployment systems
  • Proficient in Python and infrastructure-as-code tools such as Terraform, Helm, or equivalent
  • Strong grasp of distributed systems, performance optimization, and reliability engineering in production environments
  • Ability to leverage AI coding assistants effectively to speed up development and debugging tasks
  • Demonstrated ownership and ability to work independently in a remote-first setting

Nice to Have

  • Experience working with ML platforms like Kubeflow, MLflow, or KubeAI
  • Knowledge of GPU scheduling, CUDA/ROCm optimization, or multi-tenant inference architectures
  • Track record in optimizing costs across different GPU types and inference workloads
  • Background in early-stage startups or building greenfield infrastructure projects
  • Proven ability to design and implement production systems from scratch rather than maintaining legacy systems

Tech Stack

vLLM, TGI, Triton, Python, Terraform, Helm, Kubernetes, GPU-based workloads, CUDA, ROCm, CI/CD, model registries, experiment tracking, distributed systems, container orchestration

Benefits

  • Lead critical infrastructure development for a rapidly expanding AI-native cloud platform
  • Design and implement foundational ML inference systems from the ground up in a high-growth environment
  • Work at the intersection of distributed systems, GPU computing, and sustainable cloud architecture
  • Develop deep expertise in next-generation AI infrastructure and large-scale model serving
  • Shape core engineering decisions and define scalable best practices for the organization

Work Arrangement

Global (EMEA) — fully remote, with a focus on EMEA time zones

Team

Distributed team within a fast-scaling startup; collaborates with infrastructure, platform, and applied AI teams

  • Fair, transparent, and inclusive recruitment process
  • No discrimination based on age, disability, gender, gender identity or expression, marital or civil partner status, pregnancy or maternity, race, religion or belief, sex, or sexual orientation

Additional Information

  • Location: Fully remote (EMEA time zones)
  • Start date: ASAP
  • Languages: Fluent English required
  • Industry: Cloud Computing / AI / European Deep-Tech SaaS
  • Personal data will be processed lawfully, fairly, and securely under GDPR for recruitment purposes only
  • Role includes on-call responsibilities
  • Remote-first environment demands independent operation and strong ownership
About company
Pragmatike
Pragmatike is the largest remote tech job platform, connecting businesses with vetted remote tech talent and helping tech professionals find roles at tech companies. The company specializes in fast, human-driven talent matching, helping clients hire top-tier developers, product managers, and designers within 48 hours.
Job Details

  • Department: Work with our Clients
  • Category: Infrastructure
  • Posted: 9 days ago