NVIDIA is looking for a Senior System Software Engineer for its AI Data Platform. You will design, build, and optimize highly scalable and reliable automation systems that ensure the peak performance and seamless deployment of NVIDIA's core software offerings across a diverse global ecosystem, directly impacting how AI models are validated and delivered.
What You'll Do
- Develop efficient infrastructure and tools for automating complex software processes.
- Implement advanced test harnesses, benchmarking frameworks, and analytical tools to rigorously characterize and optimize software and hardware performance.
- Apply deep knowledge of operating systems, kernel internals, device drivers, memory management, storage, networking, and high-speed interconnects to build and troubleshoot highly performant systems.
- Work with engineering teams to understand needs, define requirements, and deliver efficient solutions.
- Set performance goals, monitor feedback, analyze data, and continuously improve system reliability.
- Contribute to defining technical strategies and roadmaps for platform automation initiatives.
What We're Looking For
- Bachelor's or Master's degree in Computer Science, Computer Engineering, or a related technical field, or equivalent experience.
- 5+ years of industry experience in software development, focusing on infrastructure, distributed systems, automation, and/or performance engineering.
- Proven ability to develop robust tools and automation using programming languages such as C++, Python, or Go.
- Experience with operating system internals, device drivers, memory management, and debugging performance issues in complex compute applications.
- Experience in designing, building, and operating large-scale distributed systems, with knowledge of networking protocols, cluster management, and high-performance interconnects.
- Experience building and maintaining automated testing, benchmarking, and continuous integration/continuous deployment pipelines.
- Outstanding analytical, problem-solving, and debugging skills, with a track record of resolving complex technical challenges.
- Excellent interpersonal and communication skills, with the ability to articulate complex technical concepts to diverse audiences and collaborate effectively across teams.
Nice to Have
- Experience optimizing performance for AI/Machine Learning workloads, especially inference applications, on diverse hardware platforms.
- Prior experience building or contributing to large-scale compute infrastructure solutions in cloud environments or on-premises data centers.
- Experience with containerization and orchestration technologies, such as Docker and Kubernetes.
- Familiarity with performance profiling tools and methodologies for hardware and software systems.
- Track record of driving significant efficiency gains or architectural improvements in large-scale systems.
Technical Stack
- C++
- Python
- Go
- Docker
- Kubernetes
Benefits & Compensation
- Highly competitive salaries
- Comprehensive benefits package
NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.





