Responsibilities
- Participate in an on-call rotation using PagerDuty to address service disruptions affecting system availability
- Develop and maintain infrastructure using Ansible, Terraform, and Kubernetes to support large numbers of concurrent users
- Build robust monitoring to maintain service quality for end users
- Design and execute operational workflows including software deployments and system upgrades
- Troubleshoot live production issues across distributed services and multiple layers of the technology stack
- Evaluate and recommend enhancements to system architecture focused on reliability, speed, and uptime
- Forecast and manage the expansion of infrastructure to meet growing business demands
Work Arrangement
Remote (City/Region)
Equal Opportunity Employer
This employer provides equal employment opportunities without regard to race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, or other protected characteristics.