Snowflake is hiring a Staff Software Engineer for its Production Engineering Team. You will be central to driving the reliability tools and processes that ensure Snowflake consistently delivers a top-tier customer experience. Your work will champion Service Level Objectives (SLOs), build infrastructure for rapid issue detection, and deeply engage in system health verification.
What You'll Do
- Lead improvements across the whole lifecycle of services, from inception and design through deployment, operation, and refinement.
- Scale systems sustainably through automation, and drive changes that improve reliability and velocity.
- Establish and practice low-noise incident response rotations and blameless postmortems to prevent problem recurrence.
- Write and review code. Develop documentation and capacity plans, and debug the hardest problems on large distributed systems.
- Collaborate with software engineers to establish, maintain, and optimize functional and performance SLOs.
- Participate in a 24x7 on-call rotation.
What We're Looking For
- Bachelor's degree in Computer Science, a related technical field involving software engineering, or equivalent practical experience.
- Proficiency in at least one modern programming language, preferably Golang.
- A systematic approach to problem solving and effective communication skills.
Nice to Have
- 10+ years of industry experience designing, building, and supporting large-scale systems in production.
- Experience with modern observability tools and production monitoring practices.
- Experience with capacity and load testing of distributed applications.
- Experience with containers and container orchestration systems such as Kubernetes.
- Experience deploying, managing, and operating scalable and fault-tolerant Linux infrastructure.
- Experience with SLO-driven reliability management processes.
- Hands-on experience with one or more public cloud providers (AWS, Azure, or GCP).
- Ability to spot systemic issues, define roadmaps, and guide other engineers to resolve them.
Technical Stack
- Golang
- Kubernetes
- Linux
- AWS
- Azure
- GCP
Team & Environment
You will be part of the Production Engineering Team, focused on reliability and operational excellence. We foster a culture that is all in on impact, innovation, and collaboration and emphasizes learning from every incident.
Snowflake is an equal opportunity employer.

