What You’ll Do
Drive the transformation of platform architecture by designing and maintaining resilient, cloud-native systems on AWS. You’ll shape how engineering teams deploy and manage services at scale, using Infrastructure as Code with Terraform to keep infrastructure repeatable, version-controlled, and reliable.
Build and refine automated deployment workflows using GitHub Actions, ArgoCD, and Helm to streamline the journey from code commit to production. Manage Kubernetes environments to keep containerized applications running efficiently and securely.
Develop custom command-line tools and automation scripts in Python or Go to reduce manual overhead and empower developers. Improve system observability by tracking performance metrics and cost drivers, enabling data-backed decisions across engineering.
When issues arise, lead root cause investigations and implement durable fixes that strengthen platform resilience over time.
Requirements
You have a degree in Computer Science or a related engineering discipline and demonstrated experience diagnosing and resolving issues in distributed systems. You’re proficient with core AWS services and understand networking concepts including DNS, HTTP, and VPCs.
Hands-on experience managing Kubernetes clusters is essential, along with fluency in Helm for configuration management. You write clean, maintainable code in Python, Go, or TypeScript, and you consistently choose automation over repetition.
You approach infrastructure with a security-first mindset, embedding best practices across every layer of the stack. You’re able to collaborate across functions, turning technical requirements into clear documentation and actionable guidance for teams.
Technical Stack
- AWS
- Terraform (IaC)
- GitHub Actions
- ArgoCD
- Helm
- Kubernetes
- Python, Go, TypeScript
- CI/CD & Containerization
- Cloud-native architecture


