Germany (Remote); Zurich, Switzerland; Munich, Germany; Berlin, Germany | Remote | Full-time

NVIDIA is hiring a Senior HPC and AI Networking Performance Research and Analysis Engineer

About the Role

As a Senior HPC and AI Networking Performance Research and Analysis Engineer, you will investigate and improve the performance of AI workloads running on large-scale GPU and CPU systems. Your primary focus will be distributed deep learning applications, particularly large language model training and inference, where communication patterns and network efficiency play a critical role.

Key Responsibilities

Conduct in-depth profiling and analysis of AI workloads to uncover performance bottlenecks, especially in communication and data transfer layers
Design and execute benchmarking strategies to evaluate system behavior under real-world conditions
Collaborate with hardware and software teams to assess performance across CPUs, GPUs, host channel adapters, and network switches
Develop and apply simulation models, performance tools, and analytical methods to diagnose system limitations
Investigate low-level system interactions to determine root causes of performance issues
Establish performance baselines and define testing strategies for emerging technologies
Guide optimization efforts to achieve maximum system throughput and efficiency

Qualifications

Applicants should hold a Bachelor's degree in Computer Science or Software Engineering and bring at least six years of hands-on experience in high-performance networking. Essential skills include deep familiarity with RDMA, MPI, NCCL, and networking protocols such as RoCE. Proficiency in Python, Bash, and C is required, along with strong Linux system knowledge.

Experience with NVIDIA GPUs, CUDA libraries, and deep learning frameworks like TensorFlow or PyTorch is necessary. Demonstrated ability in performance analysis, problem solving, and cross-team collaboration is essential.

Preferred Background

Proven track record in benchmarking AI workloads, especially for distributed LLM training
Strong understanding of CUDA and NCCL internals
Comprehensive knowledge of system architecture, including CPUs (Intel, AMD, ARM), GPUs, memory, and PCI subsystems
Familiarity with congestion control mechanisms in high-speed networks

Required Skills

RDMA, MPI, NCCL, RoCE, CUDA, TensorFlow, PyTorch, Python, Bash, C, Performance Analysis, Distributed Deep Learning, Collective Communication, HPC, Networking

About company

NVIDIA builds accelerated computing platforms and AI technologies that power advancements in areas such as generative AI, data centers, robotics, and digital twins.

Job Details

Department: Performance group

Category: Data

Posted 2 months ago
