Dubai, Dubai, United Arab Emirates Hybrid

Sana Commerce is hiring a Team Lead Site Reliability Engineer

Build and manage a global Site Reliability Engineering team focused on monitoring, maintaining, and enhancing the reliability of cloud infrastructure. Lead incident response, promote automation, and integrate operational best practices across development teams to ensure high system availability and performance.

Responsibilities

  • Lead the SRE team by defining goals and guiding efforts to achieve strong system reliability while managing cost and performance commitments.
  • Work closely with platform and product engineering teams to integrate reliability and operational standards into the development lifecycle.
  • Establish and implement SRE frameworks, including service level objectives, service level indicators, and error budgeting.
  • Promote automation across operations to minimize manual tasks, improve system efficiency, and support scalable growth.
  • Manage incident response processes, conduct post-mortem reviews, and lead root cause analysis to prevent recurring issues.
  • Lead capacity planning and scalability initiatives to support business growth and optimize resource usage.
  • Oversee disaster recovery planning and testing to ensure uninterrupted service for customer webstores.
  • Foster a culture of continuous learning by mentoring team members and encouraging innovation.
  • Stay current with advancements in SRE practices and advocate for the adoption of relevant technologies and methodologies.

Requirements

  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related technical field.
  • Minimum of 5 years of experience in Site Reliability Engineering, including at least 2 years in a leadership capacity.
  • Extensive hands-on experience with Microsoft Azure, including deploying and managing cloud-native systems; knowledge of Kubernetes is required.
  • Solid understanding of network protocols, load balancing, and high availability setups.
  • Experience applying software engineering to SRE challenges, with proficiency in languages such as PowerShell, C#, Python, Go, or Java.
  • Proven experience with automation tools and infrastructure-as-code platforms like Terraform and Ansible.
  • Skilled in using monitoring and logging tools such as Prometheus, Grafana, and the ELK Stack to build comprehensive observability solutions.
  • Demonstrated ability to solve complex technical problems under pressure.
  • Proven leadership experience, including mentoring and growing high-performing engineering teams.
  • Strong communication and collaboration skills with a history of effective cross-team coordination.

Nice to Have

  • Familiarity with Dynatrace is advantageous.

Tech Stack

Microsoft Azure, Kubernetes, Terraform, Ansible, Prometheus, Grafana, ELK Stack, Dynatrace, PowerShell, C#, Python, Go, Java

Benefits

  • Opportunity to drive impact within a rapidly growing SaaS scale-up.
  • Eligibility for up to 5 weeks of 'work from anywhere' annually.
  • Customized global onboarding program, highly rated by new hires.
  • Hybrid work model with 3 days in office and 2 days remote per week.
  • Weekly company-sponsored lunch.

Work Arrangement

Hybrid: 3 days in the office and 2 days from home per week, plus up to 5 weeks of “work from anywhere” per year.

Team

Global SRE team managing and monitoring all systems, environments, and infrastructure.

  • We deliver lasting success by balancing immediate results with long-term value.
  • We empower customers by transforming B2B commerce and enabling their leadership.
  • We embrace challenges and continuously raise the bar for ourselves and the industry.
  • We act boldly, supported by trust and mutual accountability within the team.

Additional Information

  • Even if you don’t meet all listed qualifications, we encourage applications from candidates who align with our vision and are eager to grow with us.
  • This role includes a hybrid work model with remote flexibility and a 'work from anywhere' benefit.
Required Skills
Microsoft Azure, Kubernetes, Terraform, Ansible, Prometheus, Grafana, ELK Stack, PowerShell, C#, Python, Go, Java
About company
Sana Commerce
Sana Commerce is an e-commerce platform designed to help manufacturers, distributors, and wholesalers succeed by fostering lasting relationships with customers. Founded in 2007, it is a fast-growing SaaS company that encourages employees to take ownership of their careers.
Job Details
Department: Engineering
Category: Infrastructure
Posted 3 months ago