Coram AI is looking for an AI Research Engineer to deploy high-performance vision and multimodal models onto robotic platforms where latency, reliability, and hardware constraints matter. You'll work at the intersection of robotics, real-time systems, and deep learning to transform a $50B+ legacy industry by bringing the power of cutting-edge AI to real-world security and operations.
What You'll Do
- Deploy deep learning models on edge devices and in the cloud for real-time inference.
- Fine-tune models on proprietary datasets and manage dataset versioning, labeling, and evaluation.
- Write high-quality C++ or Rust code for deterministic, low-latency execution.
- Build cloud pipelines that process millions of images and video streams in near real time.
- Perform model surgery in PyTorch and TensorRT, including pruning, quantization, and graph optimization.
- Optimize GPU utilization, memory footprint, and inference throughput.
- Build and maintain middleware for real-time IPC between perception, planning, and control systems.
- Profile production systems to diagnose memory, compute, and concurrency bottlenecks.
- Design rigorous evaluation loops to measure model accuracy, latency, and robustness in field conditions.
What We're Looking For
- 3+ years of experience building robotics, perception, or real-time systems (startup or high-performance production environments strongly preferred).
- Strong experience building real-time robotics systems that span software and hardware.
- Experience deploying neural networks under strict latency constraints where milliseconds matter.
- Deep understanding of GPU memory management, batching strategies, and compute optimization.
- Solid experience with PyTorch for training, fine-tuning, and modifying models.
- Strong programming skills in C++ (preferred) with experience in Rust or Python.
- Experience deploying and operating deep learning models in production.
- Experience deploying models to edge devices or embedded systems where compute and memory resources are constrained.
- Strong debugging and profiling skills using low-level performance tools and system profilers.
- Experience building reliable, observable, and fault-tolerant production systems.
- BS, MS, or PhD in Computer Science, Robotics, Electrical Engineering, or a related technical field.
- Excellent communication skills (written and verbal) in English.
- Passion for building high-performance systems at the intersection of robotics, AI, and real-time infrastructure.
- Resilience and adaptability in challenging, fast-paced startup environments.
- Ability to work onsite.
Nice to Have
- Experience with TensorRT and ONNX (highly desirable).
- Experience with vLLM, SGLang, or other high-performance LLM inference engines.
- Experience deploying multimodal models or LLMs in robotics contexts.
- Experience with distributed systems, structured logging, and observability at scale.
- Familiarity with distributed pub/sub messaging, real-time Linux, or embedded GPU platforms.
- Experience working with NVIDIA Jetson, CUDA kernels, or custom accelerators.
Technical Stack
- C++, Rust, Python
- PyTorch, TensorRT, ONNX
- CUDA, NVIDIA Jetson
Team & Environment
You will join a small, fast-moving team where every person has a voice, ships meaningful work, and helps shape the product.
Benefits & Compensation
- Competitive compensation package.
- 100% employer-paid medical, dental, vision, and base life insurance.
- Flexible paid time off and 9 paid holidays.
- 401(k) with both Traditional and Roth options.
- Equity in a rapidly growing company.
- Referral bonuses.
- Daily team dinners and regular team off-sites to build connection and momentum.
- The latest Apple tech and unlimited tools so you can win.
- Unlimited Cursor and Claude Code credits.
- Direct exposure to our AI-native go-to-market (GTM) machinery.
Work Mode
This is an onsite position.
Coram AI is an equal opportunity employer.



