Extreme Networks is hiring a Cloud Operations Engineer to help lead infrastructure engineering for ExtremeCloud, our multi-cloud SaaS platform. In this role, you will design, build, and operate large-scale, multi-region Kubernetes environments across AWS, GCP, Azure, and on-prem to drive reliability, scalability, and operational excellence.
What You'll Do
- Architect and scale multi-cluster, multi-region Kubernetes deployments using EKS, GKE, and AKS.
- Take end-to-end ownership of production infrastructure, including incident response and postmortems.
- Build and maintain Terraform modules for complex infrastructure patterns using GitOps principles.
- Design and optimize ArgoCD ApplicationSets and Helm chart architectures for automated deployments.
- Analyze system performance, identify bottlenecks, and implement optimizations to meet or exceed SLOs.
- Build and enhance monitoring, alerting, and observability using Prometheus, Grafana, Loki, and custom tooling.
- Implement security controls, compliance frameworks, and best practices across cloud infrastructure.
- Mentor engineers, establish best practices, and drive technical decisions.
What We're Looking For
- 5+ years of experience in cloud infrastructure engineering, with deep expertise in at least one major cloud provider (AWS preferred).
- Strong Kubernetes experience: cluster design, operators, controllers, and multi-cluster management.
- Proficiency with Infrastructure as Code: Terraform, CloudFormation, or similar.
- GitOps expertise: ArgoCD, Flux, or similar; experience with ApplicationSets and complex deployment patterns.
- Deep Linux and networking knowledge.
- Experience operating distributed data and messaging systems: Elasticsearch, PostgreSQL, Redis, Kafka, RabbitMQ.
- Experience with monitoring and observability: Prometheus, Grafana, ELK stack, or similar.
- Strong problem-solving skills and experience debugging complex distributed systems.
- Experience with cloud security, compliance (SOC 2, ISO 27001), and secure-by-design practices.
- Excellent communication skills for working across time zones and with distributed teams.
- Self-directed with a track record of owning problems end-to-end.
Nice to Have
- Experience with multi-cloud architectures and cloud-agnostic patterns.
- Contributions to open-source infrastructure projects.
- Experience with service mesh technologies (Istio, Linkerd).
- Knowledge of chaos engineering and reliability testing.
- Experience with cost optimization and FinOps practices.
Technical Stack
- Container Platform: Kubernetes
- Cloud Providers: AWS, GCP, Azure
- IaC: Terraform, CloudFormation
- GitOps: ArgoCD, Flux
- OS/Networking: Linux
- Distributed Systems: Elasticsearch, PostgreSQL, Redis, Kafka, RabbitMQ
- Monitoring/Observability: Prometheus, Grafana, Loki, ELK stack
- Service Mesh: Istio, Linkerd
Extreme Networks provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, pregnancy, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.






