The Site Reliability Engineer will ensure the reliability, performance, security, and cost efficiency of a dual-platform infrastructure: a custom multi-cloud PaaS and a large-scale enterprise AWS setup. The role balances independent ownership with close collaboration in a small, remote-first team.

Responsibilities

Maintain system stability through patching, performance tuning, incident response, and ongoing infrastructure health checks
Lead long-term initiatives such as platform migrations, internal tool development, security enhancements, and monitoring improvements
Help shape the evolution of the infrastructure to keep it modern, secure, and easy for developers to use
Support internal technical teams by answering complex infrastructure and operations questions
Collaborate with external development teams to assist with their daily technical challenges

Requirements

Fluency in written and spoken English
Strong experience with Docker, AWS, and cloud-native technologies
Programming background, preferably in Python or TypeScript, though Go or Java experience is also acceptable
Proficiency with configuration management and Infrastructure as Code tools, especially Ansible and AWS CDK
Solid understanding of core systems including Linux, networking, TCP/IP, and load balancing
Proactive and dependable work ethic with the ability to operate independently
Comfort handling support responsibilities and communicating professionally with technical clients

Nice to Have

Practical experience administering and tuning Linux systems
Familiarity with Django or similar Python web frameworks
Hands-on use of AWS CDK with TypeScript
Operational experience with PostgreSQL, Redis, RabbitMQ, or Elasticsearch
Experience with any of the technologies in our stack

Tech Stack

Docker, Django, Python, TypeScript, Ansible, AWS, EC2, S3, RDS, OpenSearch, Datadog, Redis, Elasticsearch, Nessus, AWS CDK, GitHub Actions, Cloudflare, DynamoDB, API Gateway, Gatsby, Storyblok, Lambda

Compensation

Not specified

Work Arrangement

Remote-first with team members across Europe

Team

Team of 18 people, remote-first, small and focused with minimal hierarchy

Curiosity
Ownership
Clarity
Collaborative problem-solving
Engineer-led decision making
Flexibility and adaptability
Balancing quick fixes with deep refactors

Additional Information

The company runs two distinct development workflows: 2-week sprints with quarterly OKRs for Divio Cloud, and 3-week Scrum cycles for the Enterprise AWS project
Support and on-call duties rotate weekly among team members
The team emphasizes autonomy, responsibility, and open sharing of ideas

Divio is hiring a Site Reliability Engineer