About the Role

This role combines software engineering and systems operations to build and maintain reliable, scalable systems. Responsibilities include automation, incident response, monitoring, and improving system performance and uptime.

Responsibilities

Design and implement automated deployment pipelines
Monitor system performance and respond to alerts
Troubleshoot production issues across services and platforms
Develop tools to detect and resolve system failures
Maintain high availability and performance of critical systems
Participate in on-call incident response rotations
Optimize system reliability and operational efficiency
Collaborate with development teams on service design
Enforce best practices for logging and observability
Manage configuration and infrastructure as code
Support disaster recovery planning and execution
Conduct root cause analysis for major incidents
Improve monitoring coverage and alerting accuracy
Scale infrastructure to meet growing demand
Ensure compliance with security and operational standards
Document system architecture and operational procedures
Evaluate and integrate new technologies
Drive post-mortem reviews with action plans
Reduce technical debt in operational systems
Promote a culture of reliability across engineering

Nice to Have

Master’s degree in computer science or engineering
Certifications in cloud or DevOps platforms
Experience with large-scale data processing systems
Background in software development for operations tools
Knowledge of database administration and optimization
Familiarity with service mesh technologies
Experience in regulated industries such as healthcare or logistics

Compensation

Competitive salary and benefits package

Work Arrangement

Hybrid work model with flexibility for remote and on-site work

Team

Collaborative environment integrating engineering and operations teams

About Us

We are a global company that delivers supply chain management solutions designed to improve efficiency and visibility across complex operations. Our technology supports mission-critical systems for clients in the healthcare and distribution sectors.

What We Offer

Opportunities for professional growth, a supportive team culture, flexible work options, investment in learning and development, and the chance to work on impactful, large-scale systems.

Visa sponsorship may be available for qualified candidates.

Tecsys Inc. is hiring a Site Reliability Engineer
