NVIDIA is hiring a Senior Distributed Storage Engineer - DGX Cloud

About the Role

Design and enhance distributed storage solutions that support AI workloads in cloud environments, with a focus on scalability, efficiency, and system integration.

Responsibilities

  • Develop and maintain high-performance distributed storage systems
  • Optimize storage architectures for AI and machine learning workloads
  • Collaborate with cloud infrastructure teams to integrate storage solutions
  • Diagnose and resolve performance bottlenecks in storage subsystems
  • Ensure data consistency, durability, and fault tolerance
  • Implement monitoring and alerting for storage health and performance
  • Work closely with networking and compute teams for end-to-end optimization
  • Design for multi-region and cloud-native deployment models
  • Contribute to capacity planning and scalability forecasting
  • Support deployment automation and infrastructure as code practices
  • Troubleshoot complex distributed system issues
  • Improve data replication and synchronization mechanisms
  • Evaluate new storage technologies and protocols
  • Ensure compliance with security and access control standards
  • Participate in system design reviews and technical documentation
  • Drive improvements in system availability and recovery processes
  • Collaborate on disaster recovery strategies for storage layers
  • Optimize cost-performance tradeoffs in storage infrastructure
  • Support CI/CD pipelines for storage-related services
  • Mentor junior engineers and contribute to team knowledge sharing

Compensation

Competitive salary, equity, and comprehensive benefits package

Work Arrangement

Hybrid work model with office and remote flexibility

Team

Part of a high-performance engineering team building cloud infrastructure for AI and deep learning

About DGX Cloud

DGX Cloud is a secure, cloud-based AI supercomputing service that provides instant access to powerful GPU resources and an integrated software stack for enterprise AI development.

What We Offer

  • Opportunity to work on cutting-edge storage challenges at scale
  • Access to advanced GPU-accelerated computing environments
  • Collaboration with world-class engineers and researchers

Sponsorship is available for qualified candidates who require it.

Required Skills

  • Kubernetes
  • Golang
  • Python
  • Cloud Service Provider integrations
  • Distributed Storage Systems
  • Distributed Systems
  • Performance Optimization
  • Software Engineering
  • Cloud Infrastructure
  • Networking
  • Linux
  • CI/CD
  • Automation

About NVIDIA

NVIDIA builds accelerated computing platforms and AI technologies that power advancements in areas such as generative AI, data centers, robotics, and digital twins.
Job Details
Posted 9 months ago