About the Role
Responsibilities
- Design, deploy, and maintain cloud infrastructure across GCP, Azure, and AWS
- Manage and optimize compute environments including Virtual Machines, Cloud Run, and container orchestration platforms
- Architect scalable, multi-region solutions with high availability, redundancy, and strong security practices
- Manage and monitor cloud storage solutions such as GCP Cloud Storage, Azure Blob Storage, and AWS S3
- Build, deploy, and manage containers using Docker
- Deploy and manage services on Kubernetes (GKE / AKS / EKS)
- Implement auto-scaling strategies, load balancing, and autoschedulers
- Optimize resource utilization and cost efficiency across environments
- Build and maintain CI/CD pipelines (GitHub Actions, GitLab CI/CD, or similar)
- Automate deployments, updates, rollbacks, and environment provisioning
- Create infrastructure-as-code using tools like Terraform or Pulumi
- Push the platform toward full automation and self-healing systems
- Configure VPCs, VPNs, firewalls, DNS, subnets, and secure routing
- Ensure secure API communication between the Next.js app and backend services
- Implement monitoring, logging, and alerting systems (Prometheus, Grafana, Cloud Logging, Azure Monitor, CloudWatch, etc.)
- Ensure compliance, data protection, and identity management across clouds
- Work with Firebase Authentication, secure API gateways, and IAM policies
- Optimize performance of server environments running in Cloud Run or VMs
- Implement failover systems, disaster recovery, and backup strategies
- Ensure reliable operation of services including SES / SendGrid, databases, and microservices
- Work closely with backend, AI, and product teams to support rapid iteration
- Establish DevOps best practices, guidelines, and internal documentation
- Conduct root-cause analysis, implement fixes, and prevent future incidents
- Help shape the engineering culture by improving processes, scalability, and system observability
Requirements
- Proven experience as a DevOps Engineer, Cloud Engineer, or Site Reliability Engineer
- Strong experience with GCP, Azure, and/or AWS (multi-cloud experience is a huge plus)
- In-depth knowledge of Kubernetes, Docker, Cloud Run, and container-based application delivery
- Experience with CI/CD tooling and automation frameworks
- Solid understanding of networking, firewalls, DNS, and cloud security models
- Experience with VM deployment, Cloud Run (or similar), autoscaling, and infra optimization
- Strong understanding of cloud storage solutions (Blob, S3, Cloud Storage)
- Understanding of API architecture, microservices, and distributed systems
- Experience with Terraform, Pulumi, or other infrastructure-as-code tools
- You’re a problem-solver who thrives in fast-moving environments
- You’re comfortable owning systems end-to-end and improving processes proactively
- Y
Compensation
Competitive salary plus equity (stock options).
Work Arrangement
Global