What You'll Do
- Design and manage cloud infrastructure that supports software development, data processing, and analytics workflows at scale
- Implement infrastructure as code using Terraform to maintain consistency and prevent configuration drift across environments
- Build and maintain Kubernetes platforms, including deployment tools like Argo CD and Kargo, to streamline application delivery
- Develop and optimize CI/CD pipelines that enable reliable, automated deployments across production systems
- Create monitoring solutions and dashboards using tools such as Prometheus and Grafana to improve visibility into system performance and reliability
- Respond to operational alerts, conduct root cause analysis, and drive improvements based on incident findings
- Collaborate with security and SRE teams to enforce access controls, manage cloud identities, and maintain secure configurations
- Support data infrastructure including Airflow workflows, Hadoop-based systems, and enterprise databases such as Postgres, Aurora, and Redshift
- Write automation scripts in Python and shell to streamline operational tasks and improve efficiency
- Work cross-functionally to support platform stability, troubleshoot complex issues, and enhance system resilience
Requirements
- Minimum of 4 years in infrastructure, DevOps, or systems engineering roles
- Proven experience with AWS services and cloud networking, including VPCs, subnets, DNS, load balancers, and security groups
- Strong proficiency in Terraform for infrastructure automation
- Hands-on experience with cloud identity and access management
- Familiarity with CI/CD pipelines and DevOps tooling
- Ability to write and maintain Python and shell scripts for automation
- Solid troubleshooting skills with experience in root cause analysis
- Experience supporting production systems with high uptime requirements
- Understanding of observability practices and monitoring platforms
Preferred Qualifications
- Direct experience managing Kubernetes and Docker environments
- Background operating large-scale databases and data platforms
- Exposure to Hadoop ecosystems and collaboration with data teams
- Experience maintaining Airflow workflows
- Track record of cost-conscious infrastructure decisions
Benefits
- Employer-paid health insurance for employees and dependents
- 401(k) plan with employer match (or equivalent for non-US roles)
- Flexible paid time off policy
- Regular in-person company gatherings
- Home office stipend
- Additional perks available


