The Site Reliability Engineer plays a critical role in ensuring the reliability, scalability, and performance of our production systems. This position bridges the gap between development and operations by applying engineering principles to infrastructure and operations problems. You will be responsible for building robust monitoring systems, automating operational tasks, managing incident response, and driving improvements in system design to prevent future outages. The ideal candidate thrives in a fast-paced environment, values automation over manual intervention, and is passionate about creating systems that are observable, resilient, and self-healing. You will work cross-functionally with engineering teams to embed reliability practices into the development lifecycle and ensure services meet strict uptime and performance standards.

Responsibilities

Work closely with engineering teams to develop and support the infrastructure powering their services.
Maintain high availability, security, and scalability of Kubernetes clusters running on AWS EKS.
Build and implement resilience strategies including multi-region deployment, backup systems, and disaster recovery processes.
Automate infrastructure provisioning and management using Terraform and Infrastructure-as-Code practices.
Enhance CI/CD workflows to enable faster, safer, and more reliable software deployments.
Use observability tools to monitor system performance, detect issues, and ensure service reliability.
Participate in on-call rotations and lead incident response efforts, prioritizing root-cause resolution and prevention.

Requirements

Minimum of five years of professional experience in site reliability engineering or software development.
Proficient in programming languages including Java, Python, Bash, and Go for automation and problem-solving.
Extensive hands-on experience with AWS services such as RDS, CloudFront, IAM, and VPCs, along with Terraform and Kubernetes.
Demonstrated ability to design and operate resilient systems that perform reliably under failure conditions.
Experience optimizing and managing CI/CD pipelines using tools like CircleCI or GitHub Actions.
Proven incident management skills with the ability to analyze root causes and remain effective under pressure.
Strong collaborator with clear communication skills and a commitment to continuous improvement.

Nice to Have

Bachelor’s degree in Computer Science, Software Engineering, or a related field.

Tech Stack

AWS, EKS, Kubernetes, Terraform, CI/CD, CircleCI, GitHub Actions, Java, Python, Bash, Go

Team

The SRE team collaborates with product engineering teams.

Driven by solving complex technical challenges
Prioritizes automation across systems and workflows
Committed to building resilient and dependable infrastructure
Values collaboration and ongoing improvement

Additional Information

Incident response and root cause analysis experience is required.
The role has a strong focus on automation to minimize manual intervention and prevent errors.
Incident resolution emphasizes sustainable, long-term fixes.

Luma Financial Technologies is hiring a Site Reliability Engineer

