About the Role
In this role, you will design and maintain scalable, resilient systems, combining software engineering with operational expertise to keep high-availability services running.
Responsibilities
- Design and manage highly available and fault-tolerant systems
- Implement automated monitoring and alerting solutions
- Optimize system performance and reliability
- Collaborate with development teams to make code easier to deploy
- Troubleshoot complex production issues across distributed systems
- Develop tools and scripts to streamline operations
- Maintain and enhance CI/CD pipelines
- Enforce security and compliance standards across infrastructure
- Drive incident response and post-mortem analysis
- Support cloud infrastructure at scale
- Apply infrastructure-as-code principles for consistent deployments
- Contribute to capacity planning and resource forecasting
- Ensure service-level objectives are met and error budgets are not exceeded
- Promote observability through logging, metrics, and tracing
- Reduce technical debt in operational systems
- Evaluate and integrate new technologies for reliability gains
- Participate in on-call rotations and follow rapid-response protocols
- Document system architecture and operational procedures
- Mentor engineers in best practices for system stability
- Improve disaster recovery and business continuity readiness
Nice to Have
- Master’s degree in a technical field
- Experience in high-scale SaaS environments
- Familiarity with microservices architecture
- Knowledge of service mesh technologies
- Background in security engineering
- Certifications in cloud or systems platforms
- Public speaking or open-source contributions
- Experience with large-scale data pipelines
Compensation
Competitive salary and benefits package
Work Arrangement
Remote, with flexibility to collaborate across global teams
Team
Collaborative engineering team focused on cloud systems and service reliability
Why This Role Matters
This position plays a critical role in maintaining the backbone of a growing cloud platform, ensuring services remain stable and responsive under increasing demand.
Growth Opportunities
Engineers in this role have clear paths to technical leadership, architecture design, and cross-team influence as the organization scales.
Available for qualified candidates