What You'll Do
- Design and manage a large-scale, multi-account AWS environment with an emphasis on reliability, scalability, and operational efficiency
- Evaluate system architectures for alignment with best practices in availability, security, and performance
- Work closely with development teams to forecast capacity needs and support cloud infrastructure growth
- Implement automated provisioning and configuration management using Terraform, CloudFormation, and Ansible, ensuring consistent and repeatable deployments
- Build and maintain secure access models through IAM policies, roles, and permission boundaries, adhering to least-privilege principles
- Establish robust monitoring, alerting, and logging systems using CloudWatch, Prometheus, or similar tools to support rapid incident response and self-healing workflows
- Lead improvements in CI/CD pipelines using GitHub Actions and Atlantis to streamline delivery and reduce manual effort
- Document system designs, operational procedures, and troubleshooting guides to support knowledge sharing and continuity
- Provide deep technical support during incidents, coordinate with network and vendor teams, and contribute to root cause analyses
- Drive cost optimization initiatives by analyzing usage patterns, managing reserved capacity, and enforcing efficient resource lifecycles
Requirements
- Bachelor's degree in Computer Science or equivalent practical experience
- 5+ years of hands-on cloud infrastructure experience, with a strong focus on AWS; exposure to Oracle Cloud, Azure, or GCP is beneficial
- Proficiency in Linux system administration, including networking, security, and troubleshooting on Ubuntu or RHEL
- Proven experience with infrastructure-as-code tools such as Terraform, CloudFormation, and Ansible
- Detailed knowledge of AWS services including VPC, Transit Gateway, Direct Connect, EC2, S3, Route 53, CloudTrail, Lambda, IAM, and Organizations
- Familiarity with CI/CD systems and Git-based workflows
- Strong scripting skills in Bash, Python, or similar languages
- Experience with observability platforms like CloudWatch, Prometheus, or Grafana
- Effective problem-solving abilities and strong communication skills for cross-regional collaboration
Preferred Qualifications
- Experience with Kubernetes, particularly EKS, and container orchestration
- Background in large-scale cloud networking, including multi-region designs and Transit Gateway peering
- Exposure to cloud cost management tools such as AWS Cost Explorer, CUR, or FinOps platforms
Equal Opportunity
We are committed to fostering an inclusive workplace. All candidates are considered without regard to race, color, religion, sex, national origin, age, disability, or genetic information. We participate in the E-Verify program for U.S. positions and provide reasonable accommodations for individuals with disabilities as required by law. Learn more about our culture and values on LinkedIn, Instagram, Facebook, Glassdoor, Twitter (X), and YouTube.