At Upsun, we are looking for a Site Reliability Engineer to join our mission. You will play a key part in our transition from traditional Cloud Operations to an automation-driven SRE model, focusing on improving infrastructure, automating tasks, and ensuring reliability throughout the application lifecycle.
What You'll Do
- Refine monitoring and observability with tools like Prometheus, Grafana, and the ELK Stack.
- Automate deployments and workflows using IaC tools like Terraform and Ansible.
- Optimize CI/CD pipelines for fast, reliable releases and scalability.
- Manage and scale cloud-based systems on platforms like AWS, GCP, and Azure.
- Support incident response and lead post-mortem analysis for continuous improvement.
- Collaborate with cross-functional teams to integrate reliability practices into development.
- Drive technical innovation by introducing new tools and practices to improve reliability.
What We're Looking For
- Solid understanding of DevOps, Cloud Operations, or SRE principles with a focus on reliability and scalability.
- Hands-on experience with Linux systems, including performance tuning, kernel configurations, and troubleshooting.
- Proficiency in programming languages such as Go (preferred) or Python for building tools and automation.
- Strong skills in scripting languages like Python, Bash, or Go to automate workflows and manage infrastructure.
- Extensive experience with cloud platforms like AWS, GCP, and Azure.
- Expertise in monitoring/logging frameworks and CI/CD pipelines.
- Strong problem-solving skills, system design experience, and ability to collaborate across teams.
Nice to Have
- Hands-on experience with container technologies such as Docker and Kubernetes.
Technical Stack
- Monitoring/Observability: Prometheus, Grafana, ELK Stack
- Infrastructure as Code: Terraform, Ansible
- Cloud Platforms: AWS, GCP, Azure
- Containerization: Docker, Kubernetes
- Operating System: Linux
- Languages: Go, Python, Bash
Team & Environment
You will be part of the Site Reliability Engineering team, reporting to the Director, Site Reliability Engineering.
Benefits & Compensation
- Flexible PTO
- Comprehensive healthcare coverage (UK, France, Spain)
- Company stock options
- Professional development budget
- Office equipment budget
- Wellness budget
- Annual team gatherings
- Internet reimbursement
- Inclusive parental leave
- Remote work travel program
Work Mode
This is a fully remote position open to candidates based in France, Germany, Spain, and the United Kingdom.
At Upsun, we celebrate diversity in all its forms and are committed to fostering an inclusive, equitable, and supportive workplace where everyone can thrive.





