We are looking for a Site Reliability Engineer to strengthen our infrastructure as we grow. The right candidate will have a proven track record in scaling production systems and a strong drive to build resilient, observable platforms that support real-time messaging at scale.
Responsibilities
- Design and maintain infrastructure capable of processing billions of messages and real-time events daily
- Automate deployment workflows, monitoring alerts, and incident response procedures to reduce manual intervention
- Improve on-call effectiveness through actionable alerts, comprehensive runbooks, and faster troubleshooting
- Optimize MySQL and other data storage systems for performance, scalability, and fault tolerance
- Work collaboratively across engineering teams to debug, deploy, and support production systems
- Use AI-powered tools to accelerate prototyping, enhance decision-making, and streamline engineering workflows
- Share technical insights through internal mentorship, documentation, and public-facing content such as short videos and articles
Requirements
- Minimum of 7 years of experience in site reliability, infrastructure engineering, or similar roles focused on large-scale production systems
- Extensive experience with MySQL, including schema design, query optimization, and operational tooling
- Strong proficiency with cloud-native technologies, particularly Google Cloud Platform, and infrastructure-as-code using Terraform
- Skilled in Go and Bash for systems programming, automation, and scripting tasks
- Proven ability to diagnose, monitor, and resolve issues in distributed systems using robust observability practices
- A pragmatic approach to problem-solving, prioritizing progress and ownership over theoretical perfection
Tech Stack
Go, React, Ember, Artificial Intelligence tools, Google Cloud Platform (GCP), Terraform, MySQL, Bash
Benefits
- Competitive salary range of $140,000 - $180,000 USD, adjusted for experience and local market rates
- Full coverage of medical, dental, vision, mental health, and supplemental insurance for employees and their families
- 16 weeks of paid leave for new parents
- Unlimited paid time off to support work-life balance
- Financial stipends for remote work setup and wellness activities
- Dedicated budget for professional growth and development opportunities
- Comprehensive, inclusive benefits designed to support personal and professional well-being
Compensation
Starting salary for this role is $140,000 - $180,000 USD (or the equivalent in local currency), depending on experience and subject to market-rate adjustment
Work Arrangement
Remote-friendly with global reach; no location restrictions
What We Value
- End-to-end ownership of technical challenges with a bias for action and adaptability in uncertain environments
- Engineering with a focus on user experience, weighing performance, reliability, and customer impact in every decision
- Challenging established norms with creativity and rigor, valuing progress and innovation over rigid adherence to tradition
Additional Information
- Final candidates will undergo a background check and employment verification as part of the hiring process
- Interviews will be conducted virtually using Zoom
- Formal job offers will be delivered in writing on official letterhead
- The hiring process includes a take-home assignment and a technical interview focused on system design
- Applicants are encouraged to report any suspicious activity, and to send any questions, to jobs@customer.io


