Masabi is looking for a Site Reliability Engineer to join us at the forefront of ensuring our platform's reliability, performance, and security. You'll play a pivotal role in scaling and modernising our platform while collaborating closely with architecture and product teams to enable reliable delivery across the business.
What You'll Do
- Drive automation to reduce operational overhead and human error.
- Build CI/CD pipelines, develop Infrastructure as Code (IaC) using tools like Terraform and CloudFormation, and design scalable systems to handle high traffic.
- Drive the effort to scale up new environments as we expand globally.
- Refine processes, tools, and workflows to enhance system reliability, scalability, and efficiency.
- Plan capacity to anticipate future needs and support high-performance systems.
- Ensure infrastructure meets organisational security standards and supports compliance frameworks like SOC 2 and PCI DSS.
- Maintain real-time monitoring systems aligned with SLIs and SLOs, ensuring uptime and performance meet or exceed SLAs.
- Set up proactive alerting mechanisms to address issues before they escalate.
- Monitor and optimise cloud infrastructure costs through autoscaling, rightsizing, and architectural reviews.
- Implement failover strategies, disaster recovery plans, and redundancy to ensure system resilience.
- Respond to production incidents, minimise downtime, and restore availability.
- Perform root cause analysis, implement preventive measures, and contribute to post-incident reviews.
- Partner with developers to design reliable, maintainable systems.
- Coach teams on best practices for reliability, scalability, and observability.
- Maintain detailed documentation for infrastructure, incident response, and workflows.
- Develop playbooks and runbooks to ensure seamless knowledge transfer.
What We're Looking For
- Significant experience in SRE or related roles, with a proven track record in building and maintaining reliable systems.
- Expertise in AWS Cloud technologies.
- Hands-on experience with Terraform and Grafana, along with strong knowledge of security principles and networking.
- Experience in building pipelines and robust CI/CD infrastructure.
- A collaborative team player who approaches projects with an open mind and prioritises security.
- Passionate about leveraging technology to drive advancements while ensuring reliability and security.
- Excellent communication skills, a collaborative mindset, and a willingness to learn and contribute.
- Self-sufficient and capable of working independently, while also knowing when to seek support.
Nice to Have
- Familiarity with PCI DSS v4 compliance requirements.
- AWS Cloud certification.
- Experience with container orchestration.
Technical Stack
- AWS, GitLab, Terraform, CloudFormation, Puppet, Kibana, Grafana, Confluent Cloud, Prometheus, CloudWatch, Pingdom, GitLab CI, Rundeck
Benefits & Compensation
- Up to 26 days of holiday per year plus the Christmas Shutdown (another 3-4 days).
- Private healthcare.
- Monthly team bonding allowance.
- Up to €1000 training budget per year.
- €200 to spend on your home office.
- Choice of workstation.
- Menopause support.
- Ability to work for up to 3 months per year from any country in the world.
Work Mode
This is a local-country role based in Romania.
Masabi is an equal opportunity employer, driven by purpose to make journeys simple. We foster a culture of learning, not blame, and operate with openness and trust.





