The Site Reliability Engineer is responsible for the reliability, performance, and scalability of our production systems. The position bridges development and operations by applying engineering principles to operational challenges. The engineer monitors system health, responds to incidents, automates operational tasks, and drives improvements in system resilience. Working closely with development and infrastructure teams, the role emphasizes proactive problem-solving, continuous improvement, and sound practices in deployment, monitoring, and incident management. The ideal candidate thrives in a fast-paced environment and is passionate about building robust, scalable systems that support mission-critical applications.
Responsibilities
- Design and execute complex changes to production systems to improve stability and performance
- Collaborate on initiatives to enhance automation across platform operations
- Help define and uphold standards for software deployment and operating environments
- Diagnose, resolve, and conduct root cause analysis for production incidents
- Develop scripts and tools to increase system reliability and operational efficiency
- Partner with infrastructure and engineering teams to strengthen platform capabilities
Requirements
- Proficiency in Linux system administration from day one
- Hands-on experience managing Kubernetes environments
- Familiarity with scripting languages used in automation workflows
- Proven experience supporting live production services
- Background in on-call incident response and management
Nice to Have
- Educational background in computing or a related science field
- Experience using configuration management tools
- Familiarity with public cloud platforms, particularly Azure
Tech Stack
Linux, Kubernetes, scripting languages for automation, Azure, AWS, GCP, Apache Pulsar, Apache Flink, configuration management tools, infrastructure as code, observability and monitoring tools
Benefits
- Supportive and inclusive work environment that fosters growth and recognition
- Strong commitment to diversity, equity, inclusion, and belonging
- Access to professional development opportunities
- Culture that celebrates achievements of all sizes
- Workplace that values authentic self-expression and diverse perspectives
Compensation
Not specified
Work Arrangement
Not specified
Team
Cross-functional collaboration with service teams, engineering teams, and infrastructure teams
- Customer success is a top priority
- Driven by strategy and core values
- Encourages leadership at all levels
- Focused on achieving ambitious goals
- Promotes innovation and disciplined execution
- Values diversity of thought and inclusive collaboration
- Celebrates progress and accomplishments regularly
- Fosters a sense of belonging and inclusion
Additional Information
- The company embraces diversity in thought, background, and identity
- Accommodations are available for candidates with disabilities upon request
- Job seekers should be cautious of fraudulent offers; legitimate offers are extended only after in-person or video interviews
- Official communications are sent exclusively from @anaplan.com email addresses
- Candidates are advised to verify the authenticity of job offers by contacting people@anaplan.com


