Sephora is looking for a Senior Engineer, Site Reliability Engineering - Digital. In this role, you will be responsible for ensuring hyper-stable online experiences for millions of Sephora customers by monitoring, optimizing, and safeguarding the reliability of Sephora's Dotcom platform and OMNI services.
What You'll Do
- Operate and support the Dotcom and OMNI platforms (including BOPIS and Same-Day Delivery), ensuring high availability, resilience, and hyper-stable customer experiences during normal operations and peak traffic events.
- Triage, diagnose, and resolve L2/L3 production incidents; lead post-incident reviews and partner with engineering teams on permanent corrective actions to eliminate root causes.
- Build automation solutions, reduce operational toil, and create AI-driven reliability tools and agentic workflows to improve mean time to resolution, productivity, and overall stability.
- Develop and optimize observability through logs, metrics, traces, dashboards, and anomaly detection; refine alerting and telemetry pipelines to proactively identify and resolve issues.
- Ensure world-class readiness for releases, seasonal events, feature launches, and traffic spikes through resiliency checks, performance validation, and comprehensive change reviews.
- Maintain and optimize SLO/SLI frameworks; monitor error budgets and partner with application teams on continuous reliability improvements.
What We're Looking For
- 6+ years of hands-on SRE, DevOps, or Production Engineering experience supporting high-scale digital applications.
- Strong understanding of reliability principles and operational excellence.
- Strong exposure to Azure AKS, Kubernetes, Docker, Service Mesh, and API-driven architectures.
- Operational support experience for React front-end and Spring Boot microservices in production environments.
- Hands-on experience with observability tools (Dynatrace, Splunk, Grafana, Prometheus).
- Strong scripting abilities (Python, Bash, PowerShell, YAML) to build automation.
- Proven experience in incident management, root cause analysis, and implementing permanent corrective actions.
- Experience with SRE principles, CI/CD pipelines (Jenkins, GitHub Actions), and cloud platforms (Azure required).
- Strong analytical and problem-solving abilities with clear communication skills under pressure.
Nice to Have
- Experience with AWS/GCP/OCI cloud platforms.
Technical Stack
- Azure AKS, Kubernetes, Docker, Service Mesh, React, Spring Boot, Dynatrace, Splunk, Grafana, Prometheus, Python, Bash, PowerShell, YAML, Jenkins, GitHub Actions
Benefits & Compensation
- Compensation: $155,070.00 - $172,300.00
- Healthcare plans including medical, dental, and vision coverage.
- Fully covered disability and life insurance.
- Competitive 401(k) with 4% match.
- FSA and HSA programs.
- Student Debt Retirement plan.
- Paid time off, paid sick time, and protected leave.
- Access to development programs, tuition reimbursement, and mentorship.
- 30% discount on all merchandise/services.
- Opportunities for free product ("gratis") and flash sale discounts on LVMH brand products.
- Free mental health and financial coaching resources with 24/7 access to Modern Health and Financial Finesse.
- Volunteer and donation matching.
Work Mode
This is a remote position, with a listed location of CA 94105, United States.
Sephora values a diverse and inclusive workplace and considers all applicants without regard to sex, pregnancy, race, color, national origin, gender, age, religion, sexual orientation, military/veteran status, disability, or any other protected category.