United States of America Remote (Global)

Hopper is hiring a Site Reliability Engineer

The Site Reliability Engineer will play a critical role in maintaining and improving the reliability, scalability, and performance of our cloud-native infrastructure. You will work closely with engineering teams to ensure systems are resilient, observable, and cost-efficient. This role combines deep technical expertise with operational discipline, requiring proactive problem-solving and a strong understanding of distributed systems. You will lead cost optimization initiatives, enhance monitoring and incident response, and contribute to platform-level decisions that impact the entire organization.

Responsibilities

  • Lead initiatives to enhance cost efficiency, such as minimizing network egress expenses by eliminating redundant data transfers.
  • Ensure data storage aligns with access patterns by using appropriate storage classes, including cold storage for infrequently accessed data.
  • Optimize autoscaling configurations for databases and compute resources to balance performance and cost.
  • Improve cost attribution systems so engineering teams have transparent and accurate insights into their cloud spending.
  • Respond to platform incidents as part of an on-call rotation and provide timely resolution support.
  • Assist engineers with infrastructure-related challenges and troubleshooting efforts.
  • Review and approve pull requests requiring platform-level oversight.
  • Collaborate within a small, high-performing team of SREs focused on scalable and reliable systems.

Requirements

  • Proven experience in site reliability engineering, DevOps, software engineering, or systems engineering.
  • Strong troubleshooting abilities in complex distributed systems.
  • Solid understanding of system design and strong analytical reasoning.
  • Effective communication skills for cross-team collaboration.
  • Familiarity with major cloud platforms, with a preference for Google Cloud.
  • Proficiency in SQL for data analysis and querying.
  • Hands-on experience with containers, Kubernetes, and configuration tools like Kustomize and Helm.
  • Knowledge of service mesh technologies, particularly Istio.
  • Understanding of networking concepts including DNS, TLS, certificates, and ingress routing.
  • Experience with observability tools such as Datadog for logs, metrics, and APM.
  • Working knowledge of security practices including IAM, RBAC, and network security.
  • Familiarity with authentication and authorization mechanisms.
  • Experience with CI/CD pipelines and automation.
  • Knowledge of database systems and their operational requirements.
  • Proficiency in scripting with Bash, Python, or similar languages.

Tech Stack

Google Cloud, Kubernetes, Kustomize, Helm, Istio, Datadog, SQL, Bash, Python, CI/CD, DNS, TLS, IAM, RBAC, APM, Containers

Benefits

  • Well-funded startup with significant growth ambitions.
  • Competitive compensation package.
  • Pre-IPO equity participation.
  • Unlimited paid time off.
  • Travel stipend through Carrot Cash.
  • On-demand access to co-working spaces via FlexDesk.
  • Work-from-home financial support.
  • Generous parental leave policy exceeding industry norms.
  • Direct access to company leadership and open communication channels.
  • High-impact roles within small, agile teams.
  • 100% employer-paid Medical, Dental, and Vision insurance.
  • Disability and Life insurance coverage.
  • Health Reimbursement Account (HRA) availability.
  • Access to Dependent Care Assistance (DCA/FSA) and 401k plans.

Compensation

Competitive salary and pre-IPO equity packages, plus unlimited PTO, a travel stipend, a work-from-home stipend, parental leave, HRA, DCA/FSA, and 401k.

Work Arrangement

Global (America, Europe). The team is spread across America and Europe, so you can sleep at night.

Team

A small, highly efficient team of SREs spread across America and Europe.

  • Entrepreneurial culture where pushing limits and taking risks is ev

Additional Information

  • This is a fully remote position with team members distributed across North and South America and Europe.
  • Candidates must be self-motivated and capable of working independently in an asynchronous environment.
  • Occasional travel to team meetups may be encouraged but is not required.

Required Skills

Google Cloud, Kubernetes, Kustomize, Helm, Istio, Datadog, SQL, Bash, Python, CI/CD, DNS, TLS, IAM, RBAC, APM

About company
Hopper
Hopper is a leading travel platform that powers its mobile app, website, and B2B business (HTS) using data and machine learning. It offers travel agency services and proprietary fintech products like Cancel for Any Reason and Flight Disruption Assistance. Hopper serves hundreds of millions of travelers globally and partners with major brands like Capital One, Air Canada, and Uber through its HTS division to integrate fintech and travel inventory into their direct channels.

Job Details

Department: Information Technology
Category: Infrastructure
Posted: 3 months ago