Responsibilities
- Take end-to-end ownership of Amazon EKS infrastructure, including cluster creation, scaling, and seamless version upgrades.
- Design, implement, and optimize robust CI/CD pipelines using tools like Jenkins, GitHub Actions, or GitLab CI to ensure rapid, secure software delivery.
- Architect, provision, and maintain cloud infrastructure utilizing IaC principles with Terraform.
- Automate server provisioning and application deployment processes, primarily using Ansible.
- Design and explain AWS architectures, and identify and deploy cybersecurity measures based on continuous vulnerability assessments.
- Drive engineering teams toward SecDevOps best practices and create monitoring dashboards by integrating API statuses and application logs.
- Perform root cause analysis and lead incident management to support rapid iteration and massive growth.
- Partner with SREs, L2/Support, and developers to deploy and scale new product features and improve production monitoring in a large-scale Linux environment.
Requirements
- 8+ years of total experience, with a strong focus on DevOps and SRE.
- Proven hands-on experience creating and upgrading Amazon EKS clusters.
- Deep understanding of continuous integration practices and CI/CD methodology for both containerized and non-containerized applications.
- Advanced proficiency in Infrastructure as Code using Terraform (and AWS CloudFormation).
- Proficiency in at least one programming language, such as Python.
Nice to Have
- Experience with configuration management tools; Ansible is highly preferred (experience with Chef or Salt is also valued).
- Strong understanding of Linux systems administration, networking, and large cluster orchestration.
- Hands-on experience with AWS cloud services including IAM, VPC, EC2, S3, Lambda, ECS, and API Gateway.
- Excellent communication and organizational skills, with a proven ability to coordinate effectively within a team and with customers.
Benefits
- Impact at scale from day one. Your code processes billions of rows for companies like DoorDash and LinkedIn.
- The AI wave is real for us. We're not bolting AI onto a legacy product. Intelligent connectors, context-aware data movement, and agentic workflows are the core of what we're building next.
- Small team, big problems. You'll have direct access to the CTO, real influence over product direction, and the autonomy to make significant technical bets.
- Recognized platform with startup energy. You get the credibility of enterprise validation with the speed and ownership of an early-stage company.