About the Role
Lead cloud infrastructure strategy and mentor engineers while designing scalable, secure, and cost-efficient systems using OpenTofu, Kubernetes, and GitOps. Drive multi-cloud architecture and platform decisions in an AI-native environment focused on innovation and operational excellence.
Responsibilities
- Collaborate with internal teams to design cloud-native applications and infrastructure, prioritizing scalability, cost efficiency, and security
- Develop and maintain infrastructure as code using OpenTofu and automation tools to enforce zero-console operations
- Operate and optimize large-scale Kubernetes clusters for reliability, performance, and security
- Lead cloud cost optimization initiatives, balancing resource usage with financial efficiency across environments
- Help shape the company's multi-cloud and multi-tenant architecture, ensuring compliance and long-term scalability
- Guide engineering teams through mentorship, knowledge sharing, and influence on platform design and best practices
Requirements
- Ten years of experience in technical roles, including at least five years focused on platform, infrastructure, or DevOps engineering
- Demonstrated experience leading or mentoring junior engineers
- Extensive expertise with Terraform or OpenTofu for infrastructure provisioning in console-free environments
- Production-level experience managing Kubernetes clusters, including upgrades, scaling, and issue resolution
- Proficiency with Argo CD or comparable GitOps tools for continuous delivery pipelines
- Proven ability to implement cost-saving strategies in cloud-native environments on AWS, GCP, or Azure
- Solid understanding of CI/CD pipelines, monitoring systems, observability practices, and incident response
- Strong coding skills in at least one language, such as Go, Python, or a comparable language
- Bachelor’s or master’s degree in Computer Science or a related field, or equivalent real-world experience
Nice to Have
- Experience implementing and managing service mesh technologies like Istio or Linkerd for secure inter-service communication
- Background operating infrastructure across multiple cloud providers and hybrid environments
- Familiarity with regulatory compliance standards such as SOC 2, HIPAA, ISO 27001, or FedRAMP
- Active contributions to open-source projects related to Kubernetes, Argo CD, or Terraform
- Knowledge of advanced cloud cost management platforms and FinOps methodologies
Tech Stack
OpenTofu, Terraform, Kubernetes, Argo CD, AWS, GCP, Azure, GitOps, CI/CD, Monitoring and Observability Tools, Go, Python, Service Mesh (Istio, Linkerd), FinOps Tools
Benefits
- Shape an AI-native platform that is redefining SaaS for the AI era
- Work in a culture that values deep platform thinking and delivers high-quality conversational AI and agentic systems
- Grow professionally by tackling complex challenges in AI, distributed systems, and cloud infrastructure
- Be part of a fast-growing organization with a bold mission and measurable impact
Compensation
Competitive salary and equity package
Work Arrangement
Hybrid or remote with glo
Company Culture
- Innovation-focused environment
- Emphasis on platform thinking and system design
- Commitment to solving complex technical challenges
- Support for continuous learning and professional growth
- Collaborative engineering culture
Additional Information
- The company is an equal opportunity employer that values diversity and inclusion in the workplace
- Headquarters located in Palo Alto, California, with seven global office locations