Responsibilities
- Set up, customize, and verify GPU-powered computing clusters optimized for artificial intelligence, machine learning, and high-performance computing, using NVIDIA Base Command Manager; familiarity with Warewulf and OpenHPC is beneficial
- Conduct performance testing using benchmarking tools such as HPL GPU, HPL MxP, STREAM, NCCL, RCCL, and OSU Microbenchmarks
- Produce detailed deployment documentation and performance analysis reports, and share proven optimization techniques across the engineering team
- Install, configure, and harden Linux distributions including RHEL, Ubuntu, and Rocky Linux for GenAI and HPC applications
- Work directly with clients at their sites, which requires regional and nationwide travel
Work Arrangement
Hybrid


