Twilio is hiring a Site Reliability Engineer to join the Data Infrastructure Platform team. In this role, you will design, build, and optimize the platform that supports our data-driven initiatives, collaborating cross-functionally to architect scalable solutions.
What You'll Do
- Design, build, and maintain infrastructure and scalable frameworks to support data ingestion, processing, and analysis.
- Collaborate with stakeholders, analysts, and product teams to understand business requirements and translate them into technical solutions.
- Architect and implement data streaming solutions using modern data technologies.
- Design and implement frameworks and solutions for performance, reliability, and cost-efficiency.
- Ensure data quality, integrity, and security throughout the data lifecycle.
- Stay current with emerging big data technologies and best practices.
- Mentor early-career engineers and contribute to a culture of continuous learning and improvement.
What We're Looking For
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- 8+ years of experience in Site Reliability Engineering, DevOps, or Software Engineering roles with a focus on infrastructure or backend systems.
- Strong production experience, including operational management, scaling, partitioning strategies, and tuning for performance and reliability.
- Hands-on experience with Kubernetes (preferably EKS), including deploying and managing stateful services and operators.
- Deep understanding of AWS cloud services (e.g., EC2, EBS, S3, IAM, MSK, CloudWatch, VPC, ALB/NLB).
- Proficiency in infrastructure-as-code tools, such as Terraform or CloudFormation.
- Expertise in observability tools (e.g., Prometheus, Grafana, OpenTelemetry, Datadog) to monitor distributed systems and set up alerting.
- Proficiency in at least one programming language (e.g., Go, Python, Java, or similar) for building automation and tooling.
- Experience designing and implementing incident response processes, SLOs/SLIs, and runbooks, as well as participating in on-call rotations.
- Strong understanding of distributed systems principles, including consensus, durability, throughput, and availability tradeoffs.
- Proven track record of driving reliability improvements in high-scale, data-intensive systems and collaborating with platform and data engineering teams.
- Excellent problem-solving and analytical skills.
- Strong verbal and written communication skills, with the ability to work effectively in a cross-functional team environment.
Nice to Have
- Experience with data technologies such as Apache Kafka, AWS MSK, Flink, and ClickHouse.
- A bias to action and the ability to iterate and ship rapidly.
- A passion for building data products, ideally with prior projects in this area.
Technical Stack
- Data Technologies: Kafka, AWS MSK, Flink, ClickHouse, Hive, Hudi, Presto, Airflow, Lake Formation, Glue, Athena
- Infrastructure & Cloud: Kubernetes, AWS EKS, AWS (EC2, EBS, S3, IAM, CloudWatch, VPC, ALB/NLB), Terraform
- Observability: Prometheus, Grafana, OpenTelemetry, Datadog
- Languages: Go, Python, Java
Team & Environment
You will be joining the Data Platform team, working in a cross-functional environment focused on building and maintaining the core data infrastructure.
Benefits & Compensation
- Competitive pay
- Generous time off
- Ample parental and wellness leave
- Healthcare
- Retirement savings program
- Compensation, by location:
  - Colorado, Hawaii, Illinois, Maryland, Massachusetts, Minnesota, Vermont, or Washington D.C.: $152,500 - $190,600
  - New York, New Jersey, Washington State, or California (outside the San Francisco Bay Area): $161,500 - $201,800
  - San Francisco Bay Area, California: $179,400 - $224,200
- Equity: May be eligible to participate in Twilio’s equity plan.
Work Mode
This is a remote position open to candidates located in the United States.
Twilio is proud to be an equal opportunity employer. We do not discriminate based upon race, religion, color, national origin, sex, sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, genetic information, political views or activity, or other applicable legally protected characteristics.





