Director, Site Reliability Engineering at Mastercard (Expired)

Mastercard is looking for a Director, Site Reliability Engineering to lead the vision, strategy, and execution of the Infrastructure SRE organization supporting our mission-critical Payment Networks applications. In this role, you will manage a team of highly skilled SRE infrastructure engineers dedicated to maintaining the reliability and performance of our core systems.

What You'll Do

Lead the vision, strategy, and execution of the Infrastructure SRE organization supporting mission-critical Payment Networks applications, ensuring alignment with business and platform roadmaps.
Provide strong technical leadership by driving high-level architectural discussions, influencing cross-functional teams, and shaping scalable, secure, and highly available infrastructure solutions.
Mentor, develop, and support engineers across skill levels, overseeing team meetings, performance management, and long-term career development plans.
Establish, track, and report on key team OKRs and KPIs that support broader business objectives, infrastructure health, and operational maturity.
Foster a culture of innovation, collaboration, and continuous improvement across engineering and operational teams.
Drive governance, enterprise standards, compliance, and operational excellence to increase platform scalability, uptime, availability, and resiliency.
Advance observability and telemetry capabilities to enable proactive monitoring, intelligent alerting, automated remediation, and improved root cause analysis.
Champion reliability engineering best practices, including chaos engineering, capacity planning, and incident management, to reduce operational risk and service disruption.
Partner closely with Product, Architecture, Security, and Development teams to ensure infrastructure design and operational frameworks support current and future business needs.
Own and optimize incident response frameworks, post-incident reviews, and reliability KPIs to continuously reduce incident frequency, impact, and mean time to recovery.
Oversee budget planning, resource allocation, and vendor evaluations to ensure cost-effective and scalable infrastructure investments.

What We're Looking For

5–10 years of experience as a technology leader in Site Reliability Engineering, Infrastructure Operations, or delivering large-scale infrastructure solutions.
Strong people and performance management skills, with a demonstrated ability to coach, mentor, and motivate high-performing technical teams.
Proven experience driving a culture of accountability, continuous improvement, and operational excellence.
Deep knowledge of core infrastructure technologies, including database, compute, storage, networking, cloud platforms, virtualization, and containerization.
Strong understanding of infrastructure architecture principles, including lifecycle management, governance, and operational readiness.
Ability to lead teams through complex technical problems, with a proven track record in root cause analysis across multi-disciplinary engineering groups.
Strong working knowledge of ITIL best practices, including Change, Incident, Problem, and Service Management.
Demonstrated experience improving operational processes, reducing incident noise, and enhancing system reliability and availability.
Skilled in making data-driven operational decisions using SLIs/SLOs, KPIs, and service health metrics.
Knowledge of SRE principles, including automation, observability, monitoring, capacity management, and resilience engineering.
Experience implementing infrastructure-as-code, automation frameworks, and initiatives that reduce toil and enhance stability.
Excellent communication skills with the ability to translate complex technical issues into clear, actionable information for senior leaders and non-technical stakeholders.
Strong collaboration mindset with a history of partnering effectively across Product, Engineering, Architecture, and Security teams.
Demonstrated success leading teams through large-scale change initiatives, platform migrations, cloud adoption, or major service transformations.

Team & Environment

You will lead the Payments Network SRE team, a group dedicated to the reliability of our most critical transaction systems. Mastercard's culture is built on inclusion, innovation, and collaboration.
