The Site Reliability Engineer will play a critical role in maintaining and improving the reliability, scalability, and performance of our cloud-native infrastructure. You will work closely with engineering teams to ensure systems are resilient, observable, and cost-efficient. This role combines deep technical expertise with operational discipline, requiring proactive problem-solving and a strong understanding of distributed systems. You will lead cost optimization initiatives, enhance monitoring and incident response, and contribute to platform-level decisions that impact the entire organization.

Responsibilities

Lead initiatives to enhance cost efficiency, such as minimizing network egress expenses by eliminating redundant data transfers.
Ensure data storage aligns with access patterns by using appropriate storage classes, including cold storage for infrequently accessed data.
Optimize autoscaling configurations for databases and compute resources to balance performance and cost.
Improve cost attribution systems so engineering teams have transparent and accurate insights into their cloud spending.
Respond to platform incidents as part of an on-call rotation and provide timely resolution support.
Assist engineers with infrastructure-related challenges and troubleshooting efforts.
Review and approve pull requests requiring platform-level oversight.
Collaborate within a small, high-performing team of SREs focused on scalable and reliable systems.

Requirements

Proven experience in site reliability engineering, DevOps, software engineering, or systems engineering.
Strong troubleshooting abilities in complex distributed systems.
Solid understanding of system design and strong analytical reasoning.
Effective communication skills for cross-team collaboration.
Familiarity with major cloud platforms, with a preference for Google Cloud.
Proficiency in SQL for data analysis and querying.
Hands-on experience with containers, Kubernetes, and configuration tools like Kustomize and Helm.
Knowledge of service mesh technologies, particularly Istio.
Understanding of networking concepts including DNS, TLS, certificates, and ingress routing.
Experience with observability tools such as Datadog for logs, metrics, and APM.
Working knowledge of security practices including IAM, RBAC, and network security.
Familiarity with authentication and authorization mechanisms.
Experience with CI/CD pipelines and automation.
Knowledge of database systems and their operational requirements.
Proficiency in scripting with Bash, Python, or similar languages.

Tech Stack

Google Cloud, Kubernetes, Kustomize, Helm, Istio, Datadog, SQL, Bash, Python, CI/CD, DNS, TLS, IAM, RBAC, APM, Containers

Benefits

Well-funded startup with significant growth ambitions.
Competitive compensation package.
Pre-IPO equity participation.
Unlimited paid time off.
Travel stipend through Carrot Cash.
On-demand access to co-working spaces via FlexDesk.
Work-from-home financial support.
Generous parental leave policy exceeding industry norms.
Direct access to company leadership and open communication channels.
High-impact roles within small, agile teams.
100% employer-paid Medical, Dental, and Vision insurance.
Disability and Life insurance coverage.
Health Reimbursement Account (HRA) availability.
Access to Dependent Care Assistance (DCA/FSA) and 401k plans.

Compensation

Competitive salary and pre-IPO equity packages. Additional benefits include unlimited PTO, a travel stipend, a work-from-home stipend, parental leave, HRA, DCA/FSA, and 401k.

Work Arrangement

Global (America and Europe). The team is spread across America and Europe, so you can sleep at night.

Team

A small, highly efficient team of SREs spread across America and Europe.

Entrepreneurial culture where pushing limits and taking risks is ev

Additional Information

This is a fully remote position with team members distributed across North and South America and Europe.
Candidates must be self-motivated and capable of working independently in an asynchronous environment.
Occasional travel to team meetups may be encouraged but is not required.

Hopper is hiring a Site Reliability Engineer

