About the Role
The role involves designing, implementing, and maintaining infrastructure and CI/CD systems to ensure high availability, security, and performance across cloud environments.
Responsibilities
- Design and manage cloud infrastructure on AWS
- Implement and maintain CI/CD pipelines
- Automate deployment and operational processes
- Monitor system performance and troubleshoot issues
- Ensure infrastructure security and compliance
- Collaborate with development teams on deployment strategies
- Optimize cloud resource usage and costs
- Maintain configuration management tools
- Support incident response and resolution
- Enforce infrastructure-as-code practices
- Manage containerized environments using Kubernetes
- Integrate monitoring and alerting systems
- Perform regular system audits and updates
- Document architecture and operational procedures
- Lead improvements in system reliability and scalability
- Evaluate and adopt new DevOps tools and technologies
- Support disaster recovery planning
- Ensure consistent environments across development, staging, and production
- Participate in on-call rotations
- Promote best practices in automation and security
Nice to Have
- Experience with large-scale distributed systems
- Familiarity with service mesh technologies such as Istio
- Certifications in AWS or Kubernetes
- Background in healthcare or emergency response systems
- Experience with observability platforms
- Knowledge of zero-downtime deployment strategies
- Experience mentoring junior engineers
- Contributions to open-source DevOps tools
Compensation
Competitive salary based on experience
Work Arrangement
Hybrid
Team
Collaborative engineering team focused on resilient, scalable systems
Why This Role Matters
This position plays a key role in ensuring the stability and scalability of mission-critical platforms used during emergency response and public health operations.
Technology Stack
AWS, Kubernetes, Terraform, Jenkins, Prometheus, Docker, Git, Python, Ansible