JPMorgan Chase & Co. is looking for a Site Reliability Engineer III to join our Cyber & Technology Controls organization. In this role, you will solve complex business problems with simple, maintainable solutions. You will configure, maintain, monitor, and optimize applications and infrastructure through code and cloud technologies, while actively sharing knowledge on operations, availability, reliability, and scalability.
What You'll Do
- Guide and assist others in building designs at the appropriate level and gaining consensus from peers.
- Collaborate with other software engineers and teams to design and implement deployment approaches using automated CI/CD pipelines.
- Collaborate with other software engineers and teams to design, develop, test, and implement availability, reliability, and scalability solutions in their applications.
- Implement infrastructure, configuration, and network as code for the applications and platforms in your remit.
- Collaborate with technical experts, key stakeholders, and team members to resolve complex problems.
- Understand service level indicators and utilize service level objectives to proactively resolve issues before they impact customers.
- Support the adoption of site reliability engineering best practices within your team.
- Lead major incident response, root cause analysis, and blameless postmortems.
What We're Looking For
- Formal training or certification in software engineering concepts and 3+ years of applied experience.
- Minimum of 3 years of experience supporting products and infrastructure.
- Proficiency in site reliability culture and principles, and familiarity with implementing site reliability within an application or platform.
- Proficient in at least one programming language such as Python or Java/Spring Boot.
- Proficiency in software applications and technical processes within a given technical discipline (e.g., cloud, artificial intelligence).
- Experience in observability, including white-box and black-box monitoring, service level objective alerting, and telemetry collection, using tools such as Grafana, Dynatrace, Prometheus, Datadog, and Splunk.
- Experience with continuous integration, continuous delivery, and infrastructure as code tools such as Jenkins, GitLab, or Terraform.
- Familiarity with container and container orchestration technologies such as Docker, ECS, and Kubernetes.
- Experience with SLO/SLI definition, chaos engineering (e.g., Gremlin, Chaos Monkey), and disaster recovery planning.
- Familiarity with troubleshooting common networking technologies and issues.
- Ability to contribute to large and collaborative teams by presenting information in a logical and timely manner with compelling language and limited supervision.
Nice to Have
- Hands-on experience with AWS, Azure, GCP, or other cloud environments.
- Hands-on experience with Terraform or other infrastructure as code technologies.
- Hands-on experience with GitHub and code reviews.
- Hands-on experience with DevOps practices, using Python or other scripting for automation.
- Knowledge of and hands-on experience with tools such as Jira, Confluence, ServiceNow, and Netcool.
- Experience leading and mentoring teams in SRE and DevOps practices.
Technical Stack
- Languages & Frameworks: Python, Java/Spring Boot
- Observability: Grafana, Dynatrace, Prometheus, Datadog, Splunk
- CI/CD & IaC: Jenkins, GitLab, Terraform
- Containers & Orchestration: ECS, Kubernetes, Docker
- Cloud: AWS, Azure, GCP
- Tools: GitHub, Jira, Confluence, ServiceNow, Netcool
Benefits & Compensation
- Comprehensive health care coverage
- On-site health and wellness centers
- Retirement savings plan
- Backup childcare
- Tuition reimbursement
- Mental health support
- Financial coaching
JPMorgan Chase & Co. is an Equal Opportunity Employer, including Disability/Veterans. We place a high value on diversity and inclusion. We make reasonable accommodations for applicants’ and employees’ religious practices and beliefs, as well as mental health or physical disability needs.