JPMorgan Chase & Co. is looking for a Lead Site Reliability Engineer to assume a critical role in defining the future of our globally recognized firm. You will hold a leadership position within your team, demonstrate strong knowledge across multiple technical domains, and provide advice and mentorship on technical and business issues.
What You'll Do
- Demonstrate and champion site reliability culture and practices, exerting technical influence throughout your team.
- Lead initiatives to improve the reliability and stability of your team’s applications and platforms, using data-driven analytics to raise service levels.
- Collaborate with team members to identify comprehensive service level indicators and work with stakeholders to establish reasonable service level objectives and error budgets.
- Demonstrate a high level of technical expertise within one or more technical domains and proactively identify and solve technology-related bottlenecks.
- Act as the main point of contact during major incidents for your application, demonstrating the skills to identify and solve issues quickly.
- Document and share knowledge within your organization via internal forums and communities of practice.
What We're Looking For
- Formal training or certification in software engineering concepts with 10+ years of applied experience.
- Deep proficiency in reliability, scalability, performance, security, enterprise system architecture, toil reduction, and other SRE best practices.
- Fluency in at least one programming language or framework, such as Python, Java (Spring Boot), or .NET.
- Deep knowledge of software applications and technical processes with emerging depth in one or more technical disciplines.
- Proficiency and experience in observability using tools such as Grafana, Dynatrace, Prometheus, Datadog, or Splunk.
- Proficiency in continuous integration, continuous delivery, and infrastructure-as-code tools (e.g., Jenkins, GitLab, Terraform).
- Experience with containers and container orchestration (e.g., Docker, Kubernetes, ECS).
- Experience with troubleshooting common networking technologies and issues.
- Ability to identify and solve problems related to complex data structures and algorithms.
- Drive to self-educate and evaluate new technology.
Nice to Have
- Ability to teach new programming languages to team members.
- Ability to build relationships and collaborate across different levels and stakeholder groups.
Technical Stack
- Languages & Frameworks: Python, Java (Spring Boot), .NET
- Observability: Grafana, Dynatrace, Prometheus, Datadog, Splunk
- CI/CD & IaC: Jenkins, GitLab, Terraform
- Containers & Orchestration: ECS, Kubernetes, Docker
Benefits & Compensation
- Comprehensive health care coverage
- On-site health and wellness centers
- Retirement savings plan
- Backup childcare
- Tuition reimbursement
- Mental health support
- Financial coaching
- Base salary is determined by role, experience, skill set, and location.
We are an equal opportunity employer and place a high value on diversity and inclusion at our company.




