NVIDIA is looking for a Senior Software Engineer to improve our HPC infrastructure for business-critical services and AI applications. You will join a team building and operating sophisticated, cloud-native systems using modern distributed systems patterns.

What You'll Do

Apply modern distributed systems patterns to push the limits of scale, latency, and reliability.
Continuously improve infrastructure provisioning and operations with automation, APIs, and self‑service platforms.
Operate in a globally distributed, hybrid multi‑cloud environment (AWS, GCP, on‑prem), building systems that are cloud‑native and location‑agnostic.
Build strong cross-functional relationships and align with collaborators across various business units.
Improve uptime and Quality of Service (QoS) through data-driven operations, strong SLOs, and robust incident practices.
Participate in the team’s on‑call rotation and lead high‑impact incident response when needed.

What We're Looking For

Strong coding skills in at least two of: Go, Java, C/C++, Scala, Python, Elixir, with a focus on backend, systems, or infrastructure engineering.
Deep understanding of scalability, consistency, and performance trade‑offs in server‑side systems; ability to build horizontally scalable, resilient, and low‑latency services.
Experience owning services end‑to‑end: architecture, design reviews, implementation, testing, rollout, observability, and iterative improvement.
Hands‑on experience with at least one major cloud provider (GCP, AWS, or Azure) and cloud‑native primitives (managed storage, messaging, compute).
Proficiency with modern CI/CD, GitOps workflows, and Infrastructure as Code practices for safe, repeatable changes.
Bias for action, strong problem‑solving skills, and a track record of simplifying complex systems.
B.S. in Computer Science or related field (or equivalent experience), with 5+ years of relevant experience.
Clear communication and strong collaboration skills; comfortable guiding technical decisions across teams.

Nice to Have

Prior experience building core infrastructure or control planes for HPC clusters, large-scale AI/ML platforms, or systems managed by job schedulers (e.g., Slurm or Kubernetes).
Maintainer or co‑maintainer responsibilities for an open source component used in production (plugins, operators, exporters, controllers, or SDKs) at large scale.

Technical Stack

Languages: Go, Java, C/C++, Scala, Python, Elixir
Cloud: AWS, GCP, Azure

Benefits & Compensation

Compensation Range: $152,000 USD - $241,500 USD
Equity
Comprehensive benefits package

Work Mode

This role follows a hybrid work model.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer.
