Customer.io is looking for a Site Reliability Engineer to help scale our infrastructure, reduce operational toil, and increase reliability as the company grows. You will work on high-scale systems and make our platforms better for both developers and customers.
What You'll Do
- Build and scale infrastructure to support billions of messages per day and real-time events
- Automate deployments, alerting, and incident response
- Make on-call better - clear alerts, solid documentation, and faster resolution
- Tune the performance of MySQL and other datastores, and improve reliability across distributed systems
- Collaborate across teams to debug, ship, and support systems in production
- Share knowledge and raise the bar by posting progress publicly in short videos and thoughtful writing, and by mentoring others
- Leverage AI tools to prototype, move faster, and make better decisions
What We're Looking For
- 7+ years in SRE or infrastructure roles, improving production systems at scale
- Deep MySQL experience - schema design, performance tuning, and operational tooling
- Fluency in cloud-native tech (GCP a plus) and Terraform
- Proficiency in Go and Bash for scripting and systems programming
- Skill in observability, incident response, and debugging distributed systems
- A preference for action over perfection, and pride in owning technical decisions
Technical Stack
- Go, React, Ember, MySQL, GCP, Terraform
Benefits & Compensation
- $140,000 - $180,000 USD (or equivalent in local currency) depending on experience and subject to market rate adjustment
- 100% coverage of medical, dental, vision, mental health, and supplemental insurance premiums for you and your family
- 16 weeks paid parental leave
- Unlimited PTO
- Stipends for remote work and wellness
- Professional development budget
Customer.io recognizes the stifling impact of systemic injustice on diverse communities. We commit to using our influence to increase inclusion and equity within the tech industry.