This role focuses on enhancing system reliability, performance, and operational efficiency by supporting internal teams and the Payment Platform. The engineer will engage in incident management, automation, and close collaboration with development teams to maintain scalable and secure systems.

Responsibilities

Analyze code, networking, operating systems, or storage layers to resolve complex technical challenges
Build automated solutions and use monitoring tools to maintain system stability and uptime
Respond to system incidents and assist in root cause analysis and resolution
Engage in on-call rotations and escalation procedures as part of a 24x7 reliability model
Detect and resolve performance issues through code improvements, configuration adjustments, or infrastructure upgrades
Work closely with development teams to align software design with operational needs
Refine operational processes to enhance system efficiency and reliability
Maintain awareness of emerging technologies and industry advancements
Develop and debug applications while supporting CI/CD pipelines, infrastructure automation, and scalability requirements

Requirements

Proven experience in a Site Reliability Engineering or similar operational development role
Programming background with proficiency in at least one of: C#, Java, GoLang, or Python
Hands-on experience with cloud platforms such as AWS, Azure, or GCP
Ability to perform effectively in a high-velocity environment combining development and operations
Strong interpersonal and teamwork skills for cross-functional collaboration
Familiarity with monitoring and observability tools including Grafana and Splunk
Operational knowledge of both relational and NoSQL databases
Experience using containerization tools like Docker and orchestration systems such as Kubernetes
Bachelor’s degree in Computer Science or a related technical field, or equivalent professional experience

Nice to Have

Experience implementing infrastructure as code, particularly with Terraform
Understanding of RESTful API design and development
Knowledge of Agile development principles and practices
Experience working with GitOps workflows
Exposure to Apache Kafka and event-driven architectures

Tech Stack

C#, Java, GoLang, Python, AWS, Azure, GCP, Grafana, Splunk, RDBMS, NoSQL, Docker, Kubernetes, Terraform, RESTful APIs, Apache Kafka

Team

Part of the Site Reliability Engineering team, providing operational support to internal stakeholders and Payment Platform engineering groups.

Additional Information

Participation in 24x7 on-call rotations and escalation processes is required.

WEX Inc is hiring a Site Reliability Engineer