Scalingo is hiring a Senior Site Reliability Engineer (SRE) to be responsible for the reliability, performance, and resilience of our European cloud platform. You will provide technical leadership, manage incidents, and drive strategic automation projects, with a future path toward team management.
What You'll Do
- Provide technical leadership and guidance to the SRE team, including mentoring, prioritization, and technical reviews.
- Analyze system performance, identify bottlenecks, and propose improvements for resource optimization and scalability.
- Define, implement, and improve observability tools for proactive incident detection.
- Participate in on-call rotations and manage critical incidents to limit impact.
- Lead and facilitate incident post-mortems, identifying root causes and defining corrective actions.
- Ensure compliance with service commitments and contribute to ISO 27001 and HDS compliance.
- Plan, execute, and analyze regular tests of business continuity and disaster recovery plans.
- Collaborate closely with development teams to integrate reliability, performance, and security requirements from the design phase.
- Contribute to writing, structuring, and maintaining clear and up-to-date operational documentation.
What We're Looking For
- Solid expertise in cloud environments and distributed infrastructures, with a strong culture of high availability and production reliability.
- Mastery of observability practices and structured diagnostic skills for complex incidents.
- Good understanding of containerized environments and their operational challenges.
- Proven skills with production databases: reliability, backups, restoration, replication, and scalability.
- Hands-on experience with Infrastructure as Code and environment automation.
- Awareness of operational security issues.
- Comfort using AI tools to improve daily efficiency.
- Ability to work in complex, changing, or uncertain contexts with rigor and reliability.
- Clear and structured communication, with an appetite for cross-team collaboration and knowledge sharing.
- Blameless mindset, technical curiosity, composure, and attention to user impact.
- Ability to exercise technical leadership, share knowledge, and raise the team's collective practices.
Team & Environment
You will join an SRE team of 2 people and report directly to an Engineering Manager. The role involves strong technical and operational leadership, initially without direct line-management responsibility.
Benefits & Compensation
- Fully remote, with 1 trip per quarter (Strasbourg or another city)
- Company events: 1 annual offsite and regular after-work gatherings
- Remote work allowance (57.60 €)
- Restaurant vouchers (11.52 € per voucher) and Swile card with benefits
- Health insurance fully covered by Scalingo (BENEFIZ)
- Flexible hours under a time-based agreement (RTT)
- Laptop running Linux
- Budget for additional equipment (company contribution)
Work Mode
This is a fully remote position.
We firmly believe in equal opportunities.




