MLOps Engineer responsible for building and operating reliable, efficient ML inference systems for a fast-growing distributed cloud infrastructure startup. The role focuses on model serving, GPU-based workloads, and decentralized AI/ML architectures in a remote-first EMEA setup.
Responsibilities
- Develop and manage production-grade model serving infrastructure using frameworks such as vLLM, TGI, or Triton
- Create deployment pipelines with blue/green and canary release strategies tailored to ML models (see the canary sketch after this list)
- Build and maintain auto-scaling mechanisms, multi-model serving setups, and intelligent routing for inference requests
- Improve efficiency in GPU usage, memory management, network performance, and model storage systems
- Design monitoring solutions to track inference latency, throughput, GPU utilization, cost, and system health (see the latency probe sketch after this list)
- Maintain model registries and implement CI/CD pipelines for automated, reproducible model deployments
- Oversee end-to-end ML system lifecycle, including development, production deployment, and on-call support
- Establish engineering standards and support platform scalability within a fast-paced startup environment
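To make the canary responsibility above concrete, here is a minimal sketch of one way to shift traffic during a rollout, assuming a Helm chart that exposes a `canaryWeight` value; the release name, chart path, and value key are hypothetical placeholders, not a prescribed setup:

```python
import subprocess

# Hypothetical names for illustration only: assumes Helm is installed
# and the chart routes `canaryWeight` percent of traffic to the canary.
RELEASE = "model-server"
CHART = "./charts/model-server"


def set_canary_weight(percent: int) -> None:
    """Shift `percent` of inference traffic to the canary release."""
    subprocess.run(
        [
            "helm", "upgrade", "--install", RELEASE, CHART,
            "--set", f"canaryWeight={percent}",
        ],
        check=True,
    )


if __name__ == "__main__":
    # Gradual rollout: hold at 5% while latency and error metrics are
    # compared against the stable release, then ramp up.
    set_canary_weight(5)
```

In practice a step like this would run inside a CI/CD pipeline, with automated metric checks gating each weight increase.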
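Similarly, for the monitoring responsibility, here is a minimal latency probe, a sketch assuming a vLLM-style OpenAI-compatible completions endpoint; the endpoint URL and model name are placeholders:

```python
import statistics
import time

import requests

# Placeholders for illustration: assumes a vLLM-style OpenAI-compatible
# server; swap in the real endpoint and model name.
ENDPOINT = "http://localhost:8000/v1/completions"
MODEL = "my-model"


def probe_latency(n: int = 20) -> None:
    """Send n short completion requests and report p50/p95 latency."""
    latencies = []
    for _ in range(n):
        start = time.perf_counter()
        resp = requests.post(
            ENDPOINT,
            json={"model": MODEL, "prompt": "ping", "max_tokens": 1},
            timeout=30,
        )
        resp.raise_for_status()
        latencies.append(time.perf_counter() - start)
    latencies.sort()
    p50 = statistics.median(latencies)
    p95 = latencies[min(len(latencies) - 1, int(0.95 * len(latencies)))]
    print(f"p50={p50 * 1000:.1f} ms  p95={p95 * 1000:.1f} ms")


if __name__ == "__main__":
    probe_latency()
```

A production version would export these numbers to a metrics backend such as Prometheus rather than print them; vLLM also exposes its own Prometheus metrics endpoint.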
Requirements
- Minimum of 4 years in MLOps, Platform Engineering, SRE, or related infrastructure roles, with a focus on ML systems
- Direct experience with model serving technologies such as vLLM, TGI, Triton, or comparable tools
- Solid background managing containerized GPU workloads in production on orchestration platforms such as Kubernetes
- Proven experience with MLOps tooling including model registries, experiment tracking, and automated deployment systems
- Proficient in Python and infrastructure-as-code tools such as Terraform, Helm, or equivalent
- Strong grasp of distributed systems, performance optimization, and reliability engineering in production environments
- Ability to leverage AI coding assistants effectively to speed up development and debugging tasks
- Demonstrated ownership and ability to work independently in a remote-first setting
Nice to Have
- Experience working with ML platforms like Kubeflow, MLflow, or KubeAI
- Knowledge of GPU scheduling, CUDA/ROCm optimization, or multi-tenant inference architectures
- Track record in optimizing costs across different GPU types and inference workloads
- Background in early-stage startups or building greenfield infrastructure projects
- Proven ability to design and implement production systems from scratch rather than maintaining legacy systems
Tech Stack
vLLM, TGI, Triton, Python, Terraform, Helm, Kubernetes, GPU-based workloads, CUDA, ROCm, CI/CD, model registries, experiment tracking, distributed systems, container orchestration
Benefits
- Lead critical infrastructure development for a rapidly expanding AI-native cloud platform
- Design and implement foundational ML inference systems from the ground up in a high-growth environment
- Work at the intersection of distributed systems, GPU computing, and sustainable cloud architecture
- Develop deep expertise in next-generation AI infrastructure and large-scale model serving
- Shape core engineering decisions and define scalable best practices for the organization
Work Arrangement
Global, fully remote role with a focus on EMEA time zones
Team
Distributed team within a fast-scaling startup; collaborates with infrastructure, platform, and applied AI teams
Equal Opportunity
- Fair, transparent, and inclusive recruitment process
- No discrimination based on age, disability, gender, gender identity or expression, marital or civil partner status, pregnancy or maternity, race, religion or belief, sex, or sexual orientation
Additional Information
- Location: Fully remote (EMEA timezone)
- Start date: ASAP
- Languages: Fluent English required
- Industry: Cloud Computing / AI / European Deep-Tech SaaS
- Personal data will be processed lawfully, fairly, and securely under GDPR for recruitment purposes only
- Role includes on-call responsibilities
- Remote-first environment requiring independent work and strong ownership