Brno, Czech Republic | Remote (Country)

Bloomreach is hiring a Senior Site Reliability Engineer, IMF

As a Senior Site Reliability Engineer, you will play a key role in maintaining and improving the performance, availability, and scalability of IMF, our core in-memory database, and its associated services. These systems are hosted on Google Cloud Platform, orchestrated through Kubernetes, and central to our analytics infrastructure.

Key Responsibilities

Configure and manage Kubernetes components to ensure resilient, high-performing systems.
Respond to incidents, conduct root cause analysis, and implement preventive measures.
Participate in a rotating on-call schedule with one-week shifts to support 24/7 system reliability.
Develop automation tools and scripts in Python and Go to streamline operations and reduce manual intervention.
Monitor system health using tools like VictoriaMetrics and Grafana, proactively identifying and resolving issues.
Plan capacity and ensure resource availability during high-demand periods.
Maintain robust logging and monitoring setups to enable fast detection and diagnosis of problems.
Ensure reliable backup strategies and efficient recovery processes for critical databases.
Collaborate with software engineers and product managers to deliver scalable solutions on time.
Partner with L2 support teams to improve operational workflows and issue resolution.

Required Expertise

Proven experience in DevOps or Site Reliability Engineering roles.
Strong understanding of DevOps practices and principles.
Hands-on experience with Google Cloud Platform and Kubernetes (GKE).
Solid background in building and managing CI/CD pipelines, particularly with GitLab.
Proficiency in scripting languages such as Python, Go, or Shell for automation tasks.
Experience with monitoring solutions including VictoriaMetrics, Grafana, InfluxDB, and Chronograf.
Familiarity with logging systems and distributed tracing tools.
Track record in incident management and post-mortem analysis.
Excellent problem-solving abilities and attention to detail.
Strong communication skills, with experience working in distributed, remote teams.
Ability to operate independently and prioritize tasks effectively in a fast-moving environment.

Technology Environment

You'll work with a modern stack including IMF (a C++-based in-memory database), Apache Kafka, MongoDB, Kubernetes on GCP, gRPC, Python, Go, GitLab, VictoriaMetrics, Grafana, InfluxDB, Chronograf, and Sentry.

Work Model

This role supports remote work within the Central European Time Zone. While the team operates remotely, there are opportunities to meet in person in Brno, Prague (Czechia), or Bratislava (Slovakia).

Company Culture

We foster a creative, collaborative, and customer-focused environment where technical teams are empowered to innovate. Our culture values strategic thinking, adaptability, and continuous improvement, all within a fast-paced, AI-driven landscape.

Required Skills

GitLab, Grafana, InfluxDB, Kafka, MongoDB, Kubernetes, DevOps, CI/CD, GCP

About the company

Loomi AI, Bloomreach's agentic platform, understands each customer to personalize their experience in real time — across email, web, mobile, and search. The platform connects first-party customer and product data with business metrics to deliver intelligent personalization at scale.

Bloomreach powers AI-driven marketing automation, ecommerce search, and conversational shopping experiences, helping brands increase revenue, loyalty, and conversion rates across 13+ channels.

Job Details

Department: Analytics

Category: DevOps & SRE

Posted: 3 months ago