Responsibilities
- Assess, choose, and manage the CI/CD platform used across the organization.
- Establish and monitor DORA metrics to enhance software delivery speed and reliability.
- Create temporary, on-demand preview environments for services that cannot be run locally.
- Enhance developer productivity by optimizing build times, local tools, and service templates.
- Manage the complete AWS infrastructure, including ECS/Fargate, Aurora PostgreSQL, MSK, and DynamoDB.
- Advance Terraform infrastructure-as-code practices with modularity, drift monitoring, and automated testing.
- Lead initiatives to reduce cloud spending and provide cost transparency per team.
- Oversee and refine the Datadog observability setup for monitoring and alerting.
- Develop automated tooling and documented runbooks to reduce manual operations and speed up incident resolution.
- Lead post-incident reviews and implement changes to prevent future outages.
- Enforce security best practices including container scanning, secret handling, and least-privilege access controls.
- Assist with compliance efforts for SOC 2 audits as required.
Work Arrangement
Remote (Worldwide) — US, Canada
Other
- Remote-first setup with availability during US time zones
- Asynchronous-first collaboration and communication
- Optional in-person team gatherings
- Reliance on clear written documentation for transparency
- Meetings reserved for collaboration, decisions, and team bonding
- Occasional offsites for strategy and celebration
- Flexible or unlimited paid time off
- Medical, dental, and vision benefits
- Parental leave policy
- 401(k) retirement plan
- Support for home office equipment
- Monthly stipend for remote work expenses
- Quarterly team offsites and annual company-wide gathering