The role involves scaling infrastructure to support high-volume messaging and real-time event processing. Emphasis is placed on automation, system reliability, and enhancing developer workflows through better tooling and observability.

Responsibilities

Design and maintain scalable infrastructure that handles billions of messages and real-time events daily
Automate deployment pipelines, alerting systems, and incident response workflows
Improve on-call experiences with actionable alerts, thorough documentation, and faster troubleshooting
Optimize performance and reliability of MySQL and other distributed data systems
Work cross-functionally to debug, deploy, and support production systems
Share technical progress through internal mentorship, writing, and short-form content
Use AI tools to accelerate prototyping, decision-making, and engineering velocity

Requirements

7+ years of experience in site reliability or infrastructure engineering with a focus on large-scale production systems
Extensive experience with MySQL including schema design, query optimization, and operational tooling
Strong proficiency with cloud-native technologies, including GCP and infrastructure-as-code using Terraform
Programming experience in Go and Bash for scripting and systems automation
Proven ability in observability, incident management, and debugging complex distributed systems
Bias toward action and ownership of technical decisions, even in uncertain conditions

Tech Stack

Go, React, Ember, AI, GCP, Terraform, MySQL, Bash

Benefits

Full coverage of medical, dental, vision, mental health, and supplemental insurance for employees and their families
16 weeks of paid parental leave
Unlimited paid time off
Stipends provided for remote work setup and wellness activities
Annual professional development budget
Comprehensive benefits designed to support well-being and personal growth

Compensation

Starting salary range is $140,000 - $180,000 USD (or equivalent in local currency), adjusted based on experience and market rates

Work Arrangement

Global

End-to-end ownership of problems with a bias for speed and action, even in ambiguous environments
Engineers who prioritize user experience and consider performance and reliability in customer-facing systems
A culture that questions established norms and values progress over rigid adherence to tradition

Additional Information

Final candidates are required to complete a background check and employment verification
All virtual interviews are conducted via Zoom video calls, not chat-based platforms
Job offers are formally issued in writing on official letterhead
The company is committed to addressing systemic injustice and advancing inclusion and equity in tech
Efforts include inclusive hiring practices, bias reduction, and community partnerships to broaden impact

Customer.io is hiring a Site Reliability Engineer
