Full-time

Serve Robotics is hiring an ML Performance Engineer

About the Role

Serve Robotics is looking for an ML Performance Engineer to bridge the gap between machine learning research and real-time deployment on robotic platforms. You will be the key link enabling advanced models to run efficiently on edge hardware like NVIDIA Jetson, working closely with ML researchers, embedded systems engineers, and robotics software teams.

What You'll Do

  • Own the full lifecycle of ML model deployment on robots, from handoff to full system integration.
  • Convert, optimize, and integrate trained models using frameworks like PyTorch, ONNX, and TensorRT for Jetson platforms.
  • Develop and optimize CUDA kernels and pipelines for low-latency, high-throughput inference.
  • Profile and benchmark ML workloads using tools like Nsight, nvprof, and the TensorRT profiler.
  • Identify and remove compute and memory bottlenecks for real-time inference.
  • Design and implement strategies for quantization, pruning, and other model compression techniques.
  • Ensure models are robust to the resource constraints of real-time, low-power robotic systems.
  • Manage memory layout, concurrency, and scheduling for optimized GPU and CPU usage on Jetson devices.
  • Build benchmarking pipelines for continuous performance evaluation on hardware-in-the-loop systems.
  • Collaborate with QA and systems teams to validate model behavior in field scenarios.
  • Work closely with ML researchers to influence model architectures for edge deployability.

What We're Looking For

  • A Bachelor’s degree in Computer Science, Robotics, Electrical Engineering, or an equivalent field.
  • 3+ years of experience deploying ML models on embedded or edge platforms, preferably in robotics.
  • 2+ years of experience with CUDA, TensorRT, and other NVIDIA acceleration tools.
  • Proficiency in Python and C++ for performance-sensitive systems.
  • Hands-on experience with NVIDIA Jetson platforms and edge inference tools.
  • Familiarity with model conversion workflows like PyTorch to ONNX to TensorRT.

Nice to Have

  • A Master’s degree in Computer Science, Robotics, Electrical Engineering, or an equivalent field.
  • Experience with real-time robotics software such as ROS2 and with embedded Linux systems.
  • Knowledge of performance tuning under thermal, power, and memory constraints on embedded devices.
  • Experience with techniques like INT8 quantization, sparsity, and latency-aware model design.
  • Contributions to open-source ML or CUDA projects.

Technical Stack

  • ML Frameworks: PyTorch, ONNX, TensorRT
  • Low-Level Acceleration: CUDA
  • Hardware: NVIDIA Jetson
  • Programming Languages: Python, C++
  • Robotics Middleware: ROS2

Team & Environment

You will join an agile, diverse, and driven team at Serve Robotics. We believe in solving complicated, dynamic problems collaboratively and respectfully.

Required Skills
PyTorch, ONNX, TensorRT, CUDA, NVIDIA Jetson, Python, C++, ROS2, Performance Optimization, Model Deployment, Embedded Systems, Deep Learning, Computer Vision
About company
Serve Robotics

Serve Robotics is reimagining how things move in cities with a personable sidewalk robot designed to take deliveries away from congested streets, make deliveries available to more people, and benefit local businesses.

Job Details
Category: Embedded
Posted: 7 months ago