About the Role

We are searching for a Site Reliability Engineer (SRE) who demonstrates dedication to sustaining dependable cloud-based infrastructure. Your responsibility will involve supporting Red Hat's software production services across our hybrid cloud environment. You'll collaborate closely with development, quality engineering, and release engineering teams to maintain the health of our service infrastructure. Your daily activities will encompass establishing service monitoring, enhancing automation capabilities, implementing security protocols, and addressing diverse service challenges. Engaging with professional communities, you'll help shape our hybrid cloud platform's design and share accountability for defining and tracking Service Level Indicators (SLIs) and Service Level Objectives (SLOs). In this position, you'll be required to promptly respond during service disruptions and participate in learning sessions focused on improving service resilience. Join our team committed to developing world-class open source software.

What You'll Accomplish

- Operate within a globally distributed team providing continuous support through strategic time zone coverage and structured on-call rotations
- Address service incidents using established procedures, investigate disruption origins, and coordinate resolution across multiple service teams
- Participate in incident review processes and implement corrective measures
- Manage and configure service infrastructure
- Proactively reduce operational complexity by automating repetitive and error-prone processes
- Synchronize efforts with various Red Hat technical teams to ensure cloud deployment meets rigorous quality standards
- Develop comprehensive monitoring, alert, and escalation strategies for infrastructure performance and availability issues
- Collaborate with service owners to establish, implement, and maintain precise SLIs and SLOs

Required Qualifications

- Proficiency in OpenShift administration
- Advanced Linux administration skills
- Foundational understanding of AWS technologies
- Experience with CI/CD platforms like Tekton, Pipelines as Code, potentially GitHub Actions or Jenkins
- Expertise in automation tools such as Ansible or Terraform
- Familiarity with open source monitoring technologies (Grafana, Prometheus, OpenTelemetry)
- Exceptional English communication capabilities for effective global team collaboration

Advantageous Skills

- Prior experience with SRE methodological approaches
- Software development background in Python or GoLang

About Red Hat

Red Hat is a global leader in enterprise open source software solutions, leveraging community-driven approaches to deliver cutting-edge Linux, cloud, container, and Kubernetes technologies. Spanning 40+ countries, our workforce operates flexibly across various work environments. We cultivate an inclusive culture where innovative ideas are welcomed regardless of an individual's role or tenure.

Inclusion at Red Hat

Our organizational culture embodies open source principles of transparency, collaboration, and inclusivity. We believe transformative ideas can emerge from anywhere, empowering diverse perspectives to challenge conventions and drive innovation. We are committed to providing equal opportunities and celebrating every individual's unique contributions.

Equal Opportunity Policy

Red Hat is an equal opportunity employer, evaluating candidates without discrimination based on race, color, religion, gender, sexual orientation, national origin, age, veteran status, disability, or any legally protected characteristic.

Remote (Global) Full-time
Red Hat is hiring a Site Reliability Engineer



