What You'll Do
Shape the foundation of a platform serving over half a million daily users. You'll enhance system reliability and scalability, ensuring zero-downtime performance as the product grows. Your work will directly influence how quickly and safely engineering teams ship features.
Build and refine observability practices using metrics, distributed tracing, logs, and SLOs. Develop alerting systems that reduce noise and improve incident response, making on-call more sustainable and effective.
Drive architectural improvements that balance speed, cost, and resilience. Identify inefficiencies in cloud spending and turn infrastructure optimization into a measurable advantage for the business.
Design developer experiences that prioritize speed and safety. Streamline CI/CD pipelines, environment provisioning, and automation to create clear, guided paths for engineering teams—making best practices the default choice.
Strengthen platform-wide security by implementing least-privilege access, secrets management, and system hardening—without slowing down development velocity.
Lead cross-functional initiatives, define roadmap priorities, and mentor engineers across teams. Establish standards and shared practices that scale with the organization, reducing tribal knowledge and increasing system ownership.
Requirements
- Proven experience maintaining and evolving large-scale production systems with a focus on reliability, observability, and incident management
- Track record of improving developer velocity through better tooling, automation, and CI/CD workflows
- Strong software engineering skills—able to write clean, maintainable, and well-tested code as part of platform development
- Experience making practical architectural trade-offs between performance, cost, and reliability
- Firm understanding of infrastructure security fundamentals: IAM, secrets management, network hardening, and monitoring
- Hands-on mindset—you’re ready to contribute directly to code when needed
- Fluent in English, both written and spoken
Preferred Qualifications
- Experience with AWS, particularly in serverless environments
- Familiarity with Infrastructure as Code (IaC) and modern deployment strategies
- Product-oriented thinking—awareness of user impact, performance, and real-world usage patterns
- Ability to navigate and improve complex, evolving codebases incrementally
- Strong communication skills, with experience mentoring others through code reviews and collaboration
Benefits
- Fully flexible remote work: choose where you are based, with the option to work from anywhere in the world for up to 3 months per year
- Access to coworking spaces and a dedicated budget for remote employees
- Competitive compensation and equity for all employees
- A weekly half-day dedicated to skill development and experimentation with AI tools
- Annual offsite in inspiring locations
- Team event budget and monthly local gatherings
- Support for wellness through ClassPass contributions
- Generous parental leave: 8 weeks at full pay for second parents
