About the Role
This role owns the ML platform end to end, from distributed training to production inference; details below.
Responsibilities
- Design, build, and own the end-to-end platform that supports the entire lifecycle of ML models, from massive-scale distributed training to low-latency, highly available inference.
- Implement and scale LLM inference stacks using frameworks such as vLLM, TensorRT-LLM, or SGLang (see the sketch after this list).
- Solve complex challenges in throughput, latency, token streaming, and automated scaling to deliver a seamless user experience.
- Act as a strategic partner to AI Research and Data Science teams.
- Create a developer experience that lets teams experiment, fine-tune, and deploy models quickly and with confidence.
- Develop robust CI/CD/CT (Continuous Training) pipelines using tools like Argo Workflows, ArgoCD, and GitHub Actions to automate model validation, deployment, and lifecycle management.
- Ensure systems stay flexible as workloads evolve while remaining reliable in production.
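
To give a flavor of the serving work above, here is a minimal sketch of batch LLM inference using vLLM's offline Python API. The model ID is a placeholder, not a prescribed checkpoint:

```python
# Minimal sketch of offline LLM inference with vLLM.
from vllm import LLM, SamplingParams

# Load a model; vLLM handles batching and KV-cache paging internally.
# The model ID below is an assumption for illustration.
llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")

# Sampling settings: temperature and a cap on generated tokens.
params = SamplingParams(temperature=0.7, max_tokens=128)

# Generate completions for a batch of prompts.
outputs = llm.generate(["Explain KV-cache paging in one sentence."], params)
for out in outputs:
    print(out.outputs[0].text)
```

In practice the same engine is typically run as an OpenAI-compatible server behind an autoscaler, which is where the throughput, latency, and token-streaming challenges above come in.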
Requirements
- 5+ years in infrastructure or software engineering, with at least 2 years focused on MLOps or ML infrastructure for large-scale distributed systems.
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
- Deep, hands-on expertise with Kubernetes in production.
- Fluency in the cloud-native ecosystem, including Helm, ArgoCD, and Argo Workflows.
- Ability to optimize the platform’s performance and scalability, considering factors such as GPU resource utilization, data ingestion, model training, and deployment.
- Hands-on experience with modern LLM inference serving frameworks (e.g., vLLM, SGLang, Triton Inference Server, Ray Serve).
- Understanding of the challenges specific to serving generative models, such as KV-cache memory management, variable-length outputs, and token streaming.
- Strong programming proficiency in Python or Go.
- Experience with ML frameworks such as PyTorch, JAX, or TensorFlow.
- Passion for building observable, resilient systems using modern monitoring tools (e.g., Prometheus, Grafana, OpenTelemetry); a minimal instrumentation sketch follows this list.
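
As a rough illustration of the observability bar we have in mind, a minimal sketch that exposes an inference-latency histogram via prometheus_client; the metric name and port are illustrative, not a prescribed convention:

```python
# Minimal sketch: exposing an inference-latency histogram with prometheus_client.
import time
from prometheus_client import Histogram, start_http_server

# Illustrative metric name; real deployments follow a team-wide naming scheme.
REQUEST_LATENCY = Histogram(
    "inference_request_latency_seconds",
    "End-to-end latency of inference requests",
)

def handle_request(prompt: str) -> str:
    # The context manager records elapsed time into the histogram.
    with REQUEST_LATENCY.time():
        time.sleep(0.05)  # stand-in for real model inference
        return f"echo: {prompt}"

if __name__ == "__main__":
    start_http_server(8000)  # Prometheus scrapes :8000/metrics
    while True:
        handle_request("ping")
```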
Nice to Have
- Deep performance optimization skills, including writing custom inference kernels in CUDA or Triton to accelerate model performance beyond what off-the-shelf frameworks provide.
- Experience with model optimization techniques such as quantization, distillation, and speculative decoding (see the sketch after this list).
- Exposure to training and serving multi-modal models (e.g., text-to-image, vision-language).
- Knowledge of AI safety and evaluation frameworks for monitoring model performance for things like bias, toxicity, and hallucinations.
Additional Information
- Shortlisted candidates will undergo a Background Verification (BGV).
- By applying, you consent to sharing the personal information required for the Background Verification process.