About the Role
As a Senior Solutions Architect specializing in networking with a focus on Ethernet and InfiniBand technologies, you will play a key role in shaping the infrastructure behind advanced AI and high-performance computing (HPC) systems. You'll work directly with customers, partners, and engineering teams to design, deploy, and optimize scalable networking solutions that meet demanding performance requirements.
Key Responsibilities
- Design and implement networking architectures for AI and HPC environments at scale
- Support the full lifecycle of network services, from planning and deployment to ongoing operations and optimization
- Ensure system reliability by monitoring latency, availability, and overall health of network infrastructure
- Diagnose and resolve complex networking issues across LAN and InfiniBand fabrics
- Collaborate with internal teams to report bugs, document solutions, and recommend product improvements
- Lead customer engagements to understand technical requirements and deliver effective networking strategies
- Automate network provisioning and operations using modern DevOps practices
Required Qualifications
- Degree in Computer Science, Engineering, Physics, Mathematics, or a related technical field, or equivalent practical experience
- Minimum of 8 years in networking roles, with a solid understanding of TCP/IP, data center design, and network protocols
- Proven experience configuring and troubleshooting large-scale HPC or AI networks using Ethernet and InfiniBand
- Strong command of routing and switching protocols including BGP, OSPF, EVPN, and VXLAN
- Hands-on experience with network operating systems such as Cumulus Linux, SONiC, Junos OS, IOS, or EOS
- Proficiency with automation tools such as Ansible and Salt, and with scripting in Python
- Experience building CI/CD pipelines for network infrastructure
- Excellent communication skills and a customer-focused mindset
- Ability to lead technical discussions and work effectively across teams
Preferred Background
- Exposure to public cloud networking environments (AWS, GCP, Azure)
- Industry-recognized networking or Linux certifications
- Experience with HPC job schedulers such as Slurm or PBS
- Familiarity with storage technologies like Lustre and tools such as Base Command Manager (BCM)
- Knowledge of GPU-based systems and their networking requirements
