We are looking for a Site Reliability Engineer to strengthen our infrastructure as we grow. The right candidate will have a proven track record of scaling production systems and a strong drive to build resilient, observable platforms that support real-time messaging at scale.

Responsibilities

Design and maintain infrastructure capable of processing billions of messages and real-time events daily
Automate deployment workflows, monitoring alerts, and incident response procedures to reduce manual intervention
Improve on-call effectiveness through actionable alerts, comprehensive runbooks, and faster troubleshooting
Optimize MySQL and other data storage systems for performance, scalability, and fault tolerance
Work collaboratively across engineering teams to debug, deploy, and support production systems
Use AI-powered tools to accelerate prototyping, enhance decision-making, and streamline engineering workflows
Share technical insights through internal mentorship, documentation, and public-facing content such as short videos and articles

Requirements

Minimum of 7 years of experience in site reliability, infrastructure engineering, or similar roles focused on large-scale production systems
Extensive experience with MySQL, including schema design, query optimization, and operational tooling
Strong proficiency with cloud-native technologies, particularly Google Cloud Platform, and infrastructure-as-code using Terraform
Skilled in Go and Bash for systems programming, automation, and scripting tasks
Proven ability to diagnose, monitor, and resolve issues in distributed systems using robust observability practices
A pragmatic approach to problem-solving, prioritizing progress and ownership over theoretical perfection

Tech Stack

Go, React, Ember, Artificial Intelligence tools, Google Cloud Platform (GCP), Terraform, MySQL, Bash

Benefits

Competitive salary range of $140,000 - $180,000 USD, adjusted for experience and local market rates
Full coverage of medical, dental, vision, mental health, and supplemental insurance for employees and their families
16 weeks of paid leave for new parents
Unlimited paid time off to support work-life balance
Financial stipends for remote work setup and wellness activities
Dedicated budget for professional growth and development opportunities
Comprehensive, inclusive benefits designed to support personal and professional well-being

Compensation

Starting salary for this role is $140,000 - $180,000 USD (or equivalent in local currency) depending on experience and subject to market rate adjustment

Work Arrangement

Remote-friendly with global reach; no location restrictions

End-to-end ownership of technical challenges with a bias for action and adaptability in uncertain environments
Engineering with a focus on user experience—considering performance, reliability, and customer impact in every decision
Challenging established norms with creativity and rigor, valuing progress and innovation over rigid adherence to tradition

Additional Information

Final candidates will undergo a background check and employment verification as part of the hiring process
Interviews will be conducted virtually using Zoom
Formal job offers will be delivered in writing on official letterhead
The hiring process includes a take-home assignment and a technical interview focused on system design
Applicants are encouraged to report any suspicious activity and direct questions to jobs@customer.io

Customer.io is hiring a Site Reliability Engineer
