Sinch is looking for a Senior Site Reliability Engineer to shape, scale, and optimize the critically important infrastructure underpinning Mailgun's services. You will work closely with product engineering teams to drive improvements, automate workflows, and ensure systems meet high reliability standards.
What You'll Do
- Collaborate with other teams to define and implement system requirements.
- Design, build, and maintain cloud-based microservices infrastructure.
- Automate routine operational tasks and remediation processes to improve efficiency and reliability.
- Proactively identify and resolve issues in collaboration with support and engineering teams, using monitoring tools to ensure system health.
- Ensure that datastores operate efficiently and meet performance and availability goals.
- Contribute to the team's growth by mentoring junior engineers and sharing best practices.
- Plan and execute strategies for scaling systems and infrastructure as needs grow.
What We're Looking For
- Strong background in infrastructure, operations, or software engineering with a focus on reliability.
- Extensive experience working with cloud platforms such as Google Cloud Platform (GCP) or Amazon Web Services (AWS).
- Proficiency with infrastructure-as-code and configuration management tools such as Terraform and Ansible.
- Hands-on experience with modern monitoring and observability tools such as Prometheus and Grafana.
- Proven experience with distributed databases (e.g., Cassandra, Elasticsearch) and maintaining their health at scale.
- Familiarity with distributed event stores and stream-processing platforms.
- Strong coding skills in at least one modern programming language (Python, Go, etc.).
- Expertise in running and maintaining production systems in a Linux environment and public cloud infrastructure.
- Demonstrated expertise in architecting solutions for complex technical challenges, and the ability to lead initiatives from conception through to execution.
- Strong interpersonal and communication skills, with a history of building effective relationships with cross-functional teams.
- Ability to mentor and guide junior engineers, fostering a collaborative and inclusive team culture.
Nice to Have
- Experience with container orchestration platforms.
- Expertise in CI/CD pipeline automation and infrastructure as code practices.
- Knowledge of network architecture and security best practices in cloud environments.
- Experience with containerization and microservices architectures.
- Advanced problem-solving skills, particularly in complex distributed systems.
Technical Stack
- Google Cloud Platform (GCP), Amazon Web Services (AWS)
- Terraform, Ansible
- Prometheus, Grafana
- Cassandra, Elasticsearch
- Linux, Python, Go
Team & Environment
You will be part of the SRE team responsible for keeping systems fast, reliable, and secure.
Benefits & Compensation
- Compensation: $143,000.00 - $179,000.00
- Comprehensive, market-competitive medical, dental, and vision plans.
- Access to telehealth and supplemental plans.
- Free virtual counseling resources through a global Employee Assistance Program.
- Roth and Pre-tax 401(k) options with employer match.
- Generous paid time off program.
- Paid parental leave and family planning support.
- Flexible remote work offerings.
- Paid time off to support volunteer programs.
Work Mode
This role supports a global work mode.
We’re proud to be an equal opportunity employer, and all qualified applicants will be considered to join our team regardless of race, color, religion, creed, gender, national origin, age, disability, veteran status, marital status, pregnancy, sex, gender expression or identity, sexual orientation, citizenship, or any other legally protected class.


