Own the design and operation of the secure, robust cloud infrastructure behind a high-traffic SaaS platform. You'll use Google Cloud Platform and AWS to build systems that stay reliable, performant, and resilient at scale while serving organizations worldwide.
What You'll Do
- Design and manage infrastructure across 40+ environments using Terraform, enforcing consistency and repeatability
- Operate and optimize GCP services including GKE, Cloud Run, Cloud Functions, VPCs, and Cloud Load Balancers
- Develop modular, reusable Terraform code and manage state workflows via Terraform Cloud
- Administer MongoDB Atlas clusters with attention to replication, peering, and backup strategies
- Manage Redis deployments for caching and real-time functionality using Cloud Memorystore
- Configure BigQuery pipelines, scheduled queries, and data retention policies
- Implement cost-effective storage lifecycle rules across Cloud Storage buckets
- Build and maintain CI/CD pipelines using GitHub Actions and Cloud Build with OIDC integration
- Automate container builds and manage image registries in Container Registry (GCR) and Artifact Registry
- Write automation scripts in Python and Bash to streamline operations
- Enforce code quality through pre-commit hooks and validation workflows
- Monitor system health using Google Cloud Monitoring and Sentry, defining meaningful alerting policies
- Implement distributed tracing with Cloud Trace and OpenTelemetry for performance insights
- Structure logs in Cloud Logging and derive log-based metrics from log events
- Respond to incidents, conduct root cause analysis, and apply corrective actions
- Manage SLOs and uptime checks to ensure service reliability
- Configure Cloud Armor WAF rules, rate limiting, and DDoS protections
- Secure access to internal tools using Identity-Aware Proxy (IAP)
- Automate certificate management with Let's Encrypt and Google-managed SSL/TLS certificates
- Protect secrets using Google Secret Manager with strict IAM policies
- Design secure VPC networks with private access, Cloud NAT, and VPC Service Controls
- Enforce encryption, network segmentation, and least-privilege access across systems
- Partner with engineering teams to enhance application scalability and efficiency
- Contribute to architectural planning and provide infrastructure guidance
- Document configurations, procedures, and incident runbooks
- Mentor team members on infrastructure best practices and automation
- Collaborate across functions to support deployments and system improvements
Requirements
- Bachelor’s degree in Computer Science or a related technical field, or equivalent hands-on experience
- 3–5 years of direct DevOps experience managing production cloud systems
- Deep working knowledge of Google Cloud Platform, including Cloud Run, GKE, Cloud Functions, VPC, IAM, and security services
- Advanced proficiency with Terraform for complex, multi-environment infrastructure and module development
- Experience managing MongoDB Atlas deployments, including sharding, replication, and network configuration
- Strong background in container technologies such as Docker and Kubernetes, or in serverless platforms
- Proven experience with CI/CD systems including GitHub Actions and Cloud Build
- Familiarity with GitOps and automated deployment workflows
- Proficiency in Python and Bash for scripting and automation tasks
- Experience with monitoring tools such as Sentry and native GCP observability services
- Ability to manage log aggregation, alerting, and performance metrics
Benefits
- Work 100% remotely with an async-first culture
- Receive competitive compensation
- Enjoy over 30 days of paid time off annually
- Join an inclusive, diverse team that values different perspectives
- Be part of a mission-driven organization shaping the future of work
Work Mode
This is a global, fully remote role with asynchronous collaboration as the default. Work from any location on your own schedule.