Serve Robotics is looking for an ML Performance Engineer to bridge the gap between machine learning research and real-time deployment on robotic platforms. You will be the key link enabling advanced models to run efficiently on edge hardware like NVIDIA Jetson, working closely with ML researchers, embedded systems engineers, and robotics software teams.

What You'll Do

Own the full lifecycle of ML model deployment on robots, from model handoff to full system integration.
Convert, optimize, and integrate trained models using frameworks like PyTorch, ONNX, and TensorRT for Jetson platforms.
Develop and optimize CUDA kernels and pipelines for low-latency, high-throughput inference.
Profile and benchmark ML workloads using tools like Nsight, nvprof, and the TensorRT profiler.
Identify and remove compute and memory bottlenecks for real-time inference.
Design and implement strategies for quantization, pruning, and other model compression techniques.
Ensure models are robust to the resource constraints of real-time, low-power robotic systems.
Manage memory layout, concurrency, and scheduling for optimized GPU and CPU usage on Jetson devices.
Build benchmarking pipelines for continuous performance evaluation on hardware-in-the-loop systems.
Collaborate with QA and systems teams to validate model behavior in field scenarios.
Work closely with ML researchers to influence model architectures for edge deployability.

What We're Looking For

A Bachelor’s degree in Computer Science, Robotics, Electrical Engineering, or a related field.
3+ years of experience deploying ML models on embedded or edge platforms, preferably in robotics.
2+ years of experience with CUDA, TensorRT, and other NVIDIA acceleration tools.
Proficiency in Python and C++ for performance-sensitive systems.
Hands-on experience with NVIDIA Jetson platforms and edge inference tools.
Familiarity with model conversion workflows like PyTorch to ONNX to TensorRT.

Nice to Have

A Master’s degree in Computer Science, Robotics, Electrical Engineering, or a related field.
Experience with real-time robotics systems, such as ROS2 and embedded Linux.
Knowledge of performance tuning under thermal, power, and memory constraints on embedded devices.
Experience with techniques like INT8 quantization, sparsity, and latency-aware model design.
Contributions to open-source ML or CUDA projects.

Technical Stack

ML Frameworks: PyTorch, ONNX, TensorRT
Low-Level Acceleration: CUDA
Hardware: NVIDIA Jetson
Programming Languages: Python, C++
Robotics Middleware: ROS2

Team & Environment

You will join an agile, diverse, and driven team at Serve Robotics. We believe in solving complicated, dynamic problems collaboratively and respectfully.

