About the Role
The role involves owning core infrastructure systems, collaborating with engineering teams to deploy reliable services, and improving operational workflows through automation.
Responsibilities
- Design and manage scalable cloud infrastructure platforms
- Implement and monitor system security controls
- Ensure high availability and performance of production environments
- Develop automation scripts to streamline deployment processes
- Collaborate with development teams on infrastructure needs
- Troubleshoot and resolve system-level issues
- Maintain documentation for infrastructure architecture
- Support incident response and on-call rotations
- Optimize resource utilization and cost efficiency
- Enforce compliance with security and operational policies
- Upgrade and patch infrastructure components
- Evaluate and integrate new technologies
- Participate in capacity planning activities
- Conduct system performance analysis and tuning
- Administer configuration management tools
- Deploy and maintain monitoring and alerting systems
- Support disaster recovery and backup strategies
- Work with containerization and orchestration platforms
- Ensure infrastructure-as-code practices are followed
- Collaborate on CI/CD pipeline improvements
- Lead root cause analysis for critical outages
- Promote reliability and observability standards
- Assist in migration projects between environments
- Provide technical guidance to junior team members
- Participate in architecture design reviews
Nice to Have
- Master’s degree in a technical discipline
- Certifications in cloud platforms or infrastructure technologies
- Experience with large-scale distributed systems
- Background in site reliability engineering
- Knowledge of service mesh architectures
- Experience with multi-region deployments
- Familiarity with zero-trust security models
- Contributions to open-source infrastructure projects
- Public speaking or conference presentation experience
- Leadership in technical project delivery
Compensation
Competitive salary based on experience and location
Work Arrangement
Hybrid work model with flexible scheduling
Team
Collaborative engineering team focused on infrastructure resilience and innovation
Technology Stack
- Uses AWS for cloud infrastructure, Terraform for provisioning, Kubernetes for orchestration, and Prometheus for monitoring
- Leverages GitLab CI for pipeline automation and integrates with centralized logging via Fluentd
Growth Opportunities
- Engineers are encouraged to lead initiatives, mentor peers, and contribute to architectural decisions
- Opportunities for advancement into technical leadership roles
Visa sponsorship available for qualified candidates