Site Reliability Engineer at Capital Markets Gateway (Expired)

About the Role

This role bridges software engineering and systems operations to build and maintain resilient, low-latency financial platforms for fast-paced market environments.

Responsibilities

Design and implement automated deployment pipelines for production systems
Monitor system health and proactively identify performance bottlenecks
Respond to and resolve critical production incidents with minimal downtime
Develop tools to improve operational efficiency and reduce manual intervention
Collaborate with development teams to enhance system reliability
Maintain and scale infrastructure supporting high-frequency trading platforms
Enforce observability standards using logging, metrics, and tracing
Participate in on-call rotations for rapid incident response
Optimize system performance under heavy transaction loads
Ensure configurations adhere to security and compliance requirements
Troubleshoot complex distributed system failures
Drive post-incident reviews to prevent recurrence
Implement disaster recovery and failover strategies
Support capacity planning for future growth
Integrate reliability best practices into the development lifecycle
Manage configuration consistency across environments
Automate routine operational tasks to increase team velocity
Contribute to system architecture discussions with engineering teams
Maintain documentation for operational procedures and system design
Evaluate new technologies for improving platform stability
Enforce SLA and SLO compliance across services
Work closely with security teams to address vulnerabilities
Improve deployment reliability through canary and blue-green strategies
Support audit readiness for regulatory requirements
Promote a culture of shared ownership for system reliability

Compensation

Competitive salary and benefits package

Work Arrangement

Hybrid work model with flexible remote options

Team

Collaborative engineering team focused on high-performance systems

Technology Stack

Primary languages include Go and Python
Infrastructure runs on AWS with Kubernetes orchestration
Monitoring stack includes Prometheus, Grafana, and ELK
CI/CD powered by Jenkins and GitLab CI
Infrastructure as code and configuration management via Terraform and Ansible

Performance Expectations

Maintain 99.99% uptime for core trading services
Respond to critical incidents within five minutes
Reduce mean time to resolution by 20% year over year
Achieve full automation of routine operational tasks
Ensure all services meet defined SLOs

Available for qualified candidates


