Responsibilities
- Blaze a Trail: Own and develop our platform infrastructure strategy, with the sponsorship and responsibility to match. Map out what we need, make the calls, and own the outcomes.
- Be an Owner: Be directly involved in deciding what we work on and how we work on it. Make promises, and keep them.
- Do Sensible Things: Make principled build vs. buy assessments and advocate for the right tools for the right job — not the fashionable ones, not the ones already in the estate just because they’re there.
- Garage Door Open: Create and maintain comprehensive internal documentation and decision records for systems and processes. Participate in architectural forums and make principled, open decisions that the rest of the organisation can learn from and hold us to.
Requirements
- Distributed systems depth, grounded in practice. You have a solid working model of how production systems fail: consistency and availability tradeoffs, failure cascades, backpressure, and graceful degradation. You can draw the diagram, explain the failure modes at each node, and make a reasoned argument for which ones actually matter in a given context. NALSD (non-abstract large system design) thinking is how you naturally approach a new system.
- Kubernetes at operator depth. You know what happens inside the scheduler and the control loop when things go wrong, because you’ve been there. You’ve operated clusters under real load, not just deployed workloads onto them.
- Strong Go proficiency. The platform team writes production Go. You should be fluent: you’ve built and shipped systems in it, and you have opinions about what good Go looks like.
- Multi-cloud experience, not just multi-cloud exposure. You’ve made considered architectural decisions across AWS, GCP, and/or Azure — not just consumed managed services, but evaluated tradeoffs between them and lived with those decisions in production.
- Experience defining requirements and driving technology choices across an engineering organisation. You’ve been the person in the room who frames the decision correctly, not just the one who executes it.
- Strong written and verbal communication. You can write a design doc that changes minds, and a postmortem that makes the organisation smarter. You’ve worked effectively in a globally distributed team.
Nice to Have
- Experience with storage primitives at the system level — you’ve reasoned about when to reach for a relational store vs. an object store vs. something else, and you have real opinions informed by real failures.
- Experience working on a SaaS/PaaS product across multiple cloud providers.
- Familiarity with Apache Airflow or other workflow orchestration systems.
Work Arrangement
Hybrid
Additional Information
- At Astronomer, we value diversity. We are an equal opportunity employer: we do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.