You will be part of the Database Excellence team, ensuring the PostgreSQL systems behind a major open-source SaaS platform remain reliable and scalable under heavy load. Your work will involve automating operations, improving system performance, and collaborating across teams to influence product development and infrastructure design.
Responsibilities
- Automate routine operational tasks across environments, minimizing manual intervention through tools and workflows for configuration, provisioning, and service management.
- Design and maintain scalable PostgreSQL infrastructure to support hundreds of thousands of concurrent users with high availability and performance.
- Respond to critical production incidents, working closely with other SREs to diagnose, resolve, and prevent database-related outages.
- Build and maintain observability systems that track database health, forecast capacity needs, and trigger alerts based on symptoms before outages occur.
- Collaborate with engineering and product teams to deliver performance improvements, including query optimization, migration reviews, and infrastructure guidance.
- Develop self-service automation and tooling using Terraform, Ansible, Chef, and ChatOps to enable safe, independent database interactions by engineering teams.
- Document operational knowledge, post-incident reviews, and system decisions to enable repeatability and inform future automation efforts.
- Participate in on-call rotations, covering nights, weekends, and holidays as needed to keep systems reliable.
Requirements
- Proven experience managing PostgreSQL in large-scale, high-growth production environments, including both self-hosted and managed database platforms.
- Strong proficiency with automation and configuration management tools such as Ansible, Terraform, Chef, or Puppet to ensure consistent and reliable systems.
- Solid knowledge of SQL, PL/pgSQL, data modeling, and database internals, with the ability to troubleshoot and optimize PostgreSQL performance.
- Experience operating in large, distributed SaaS environments where scalability, reliability, and performance are critical at high volume.
- Excellent written communication skills with a commitment to thorough documentation in a remote, asynchronous work culture.
- Self-driven mindset with a history of identifying problems, owning solutions, and driving improvements in systems and code.
- Ability to mentor junior engineers, deepen technical expertise, and share knowledge to elevate team capabilities.
Nice to Have
- Experience in backend development using languages such as Ruby or Go, or familiarity with OLAP systems like ClickHouse.
- Knowledge of Kubernetes and database operators for managing stateful services in containerized environments.
Tech Stack
PostgreSQL, Terraform, Ansible, Chef, Puppet, GitLab ChatOps, Kubernetes, ClickHouse, Ruby, Go, SQL, PL/pgSQL
Benefits
- Comprehensive benefits supporting health, financial security, and personal well-being
- Flexible Paid Time Off policy
- Employee resource groups fostering inclusion and community
- Equity compensation and Employee Stock Purchase Plan
- Funding for growth, learning, and professional development
- Parental leave support
- Financial assistance for home office setup
Compensation
The base salary range for this role’s level is $124,300 to $266,400 USD for US residents. Final salary is determined by experience, skills, location, and internal equity. The range does not include bonuses, equity, or benefits. Equity details are available in the company handbook. Sales roles may receive incentive pay up to 100% of base salary.
Work Arrangement
Fully remote role. The company hires globally, though some positions may have location-specific requirements.
Team
Part of the Database Excellence team, responsible for the reliability, scalability, performance, and security of the platform's database infrastructure. The team operates in a fully distributed, asynchronous model across multiple regions.
- AI is integrated into daily workflows as a key driver of efficiency, innovation, and impact.
- Performance-driven culture rooted in core values and continuous knowledge sharing.
- Inclusive environment where all perspectives are valued.
- Opportunities for rapid career growth and innovation.
- Fully remote and asynchronous collaboration model.
- Global co-creation of the company's future through distributed teamwork.
Additional Information
- Hiring is open to candidates worldwide. While all roles are remote, some may have location-based eligibility restrictions.


