About the Role
This role involves owning and evolving cloud infrastructure to support a high-performance platform. The engineer will implement scalable systems, enforce best practices in reliability and security, and collaborate across teams to deliver resilient services.
Responsibilities
- Design and manage cloud-based infrastructure at scale
- Ensure systems are highly available and fault-tolerant
- Implement and maintain infrastructure as code
- Optimize cloud costs while maintaining performance
- Support incident response and root cause analysis
- Collaborate with engineering teams on service deployment
- Enforce security standards across infrastructure layers
- Monitor system performance and proactively address issues
- Improve deployment pipelines and CI/CD workflows
- Drive automation to reduce manual operational tasks
- Maintain compliance with security and regulatory standards
- Evaluate and integrate new cloud technologies
- Document architecture and operational procedures
- Mentor engineers on infrastructure best practices
- Participate in on-call rotations for critical systems
- Troubleshoot complex distributed system failures
- Ensure disaster recovery plans are tested and effective
- Scale infrastructure in response to product growth
- Use observability tools to improve monitoring
- Promote a culture of operational excellence
- Support global service delivery with low latency
- Manage identity and access controls in cloud environments
- Securely integrate secrets management and configuration management
- Collaborate on capacity planning initiatives
- Contribute to postmortem reviews and action follow-ups
Nice to Have
- Experience with multi-cloud infrastructure
- Contributions to open-source infrastructure projects
- Advanced degree in a technical field
- Certifications in cloud platforms
- Experience with edge computing architectures
- Background in SaaS platform operations
- Knowledge of service mesh technologies
- Familiarity with zero-trust security models
- Experience leading infrastructure initiatives
- Public speaking or conference presentations
Compensation
Competitive compensation based on experience and location
Work Arrangement
Remote (Worldwide)
Team
Part of a distributed engineering team focused on scalable infrastructure
Tech Stack
- Primary cloud providers include AWS and GCP
- Infrastructure managed using Terraform
- CI/CD powered by GitHub Actions and ArgoCD
- Kubernetes used for container orchestration
- Monitoring via Prometheus and Grafana
- Logging with Fluent Bit and Elasticsearch
- Secrets managed with HashiCorp Vault
- DNS and edge routing through Cloudflare
Engineering Culture
- Emphasis on autonomy and ownership
- Data-driven decision-making across teams
- Regular postmortems with blameless reviews
- Quarterly hack weeks for innovation
- Transparent roadmaps and technical planning
- Cross-functional collaboration encouraged
- Commitment to inclusive documentation
- Active participation in code reviews
Visa sponsorship available for eligible candidates